the following statement:





The present invention concerns novel nucleotide sequences encoding insecticidal proteins 
from the Enterobacteriaceae, Serratia entomophila and Serratia proteamaculans, and the 
use of said nucleotide sequences and insecticidal proteins. 



BACKGROUND 



Some Serratia entomophila and Serratia proteamaculans strains in New Zealand are 
known to cause a disease in the major scarab pest, Costelytra zealandica (New Zealand 
grass grub). The disease was first discovered and described by Trought and Jackson 
(1982) and was later named amber disease after the distinctive colour of affected insects 
(Stucki et al. 1984). One species capable of causing the disease, Serratia entomophila,
was developed into a commercially available product ("Invade") in 1989.

The disease is highly host specific and is known to infect only a single indigenous species of
New Zealand scarab larva. The disease appears unique among insects and results not from
rapid invasion of the haemocoel, but from a slow colonisation of the gut. The disease has
a distinct phenotypic progression, with infected hosts ceasing feeding within 2-5 days of
ingesting pathogenic cells. The normally black gut clears around this time (Jackson et
al. 1993) and the levels of the major gut digestive enzymes (trypsin etc) decrease sharply
(Jackson, 1995). The clearance of the gut results in the characteristic amber colour of the
infected hosts. The larvae may remain in this state for a prolonged period (1-3 months)
before bacteria eventually invade the haemocoel, causing rapid death.

A plasmid, pADAP, which apparently encodes the disease was reported in Glare et
al. (199?), who showed a correlation between the presence of pADAP and the occurrence of
disease in bacterial strains. This was further confirmed by Glare et al. (1996), who showed that
transfer of the plasmid from pathogenic to non-pathogenic strains rendered the recipient
strains pathogenic.



Grkovic et al. (1995) showed that disruption of the plasmid by transposon insertion could
alter pathogenicity, without fully defining the area containing the gene cassette. By marker
exchange, they showed that a 10.5kb HindIII construct (pGLA20) from pADAP encoded
some functions of amber disease; however, the clone did not contain all of the
disease-encoding plasmid-borne regions.



Another region involved in encoding amber disease was located by Nunez-Valdez
and Mahanty (1996). They located a locus, amb2, by transposon mutagenesis and
cosmid genomic library [...]




involved in antifeeding in the larvae of Costelytra zealandica. However, the present
applicants' research has demonstrated that the amb2 region is located on pADAP remote
from the virulence genes and is probably regulatory in function.

Insecticidal toxins which share some protein homology to the Serratia insecticidal 
proteins of the present invention have been recently discovered (PCT/US96/18003;
PCT/US97/07657) by a group at Wisconsin University (Blackburn et al. 1998; Bowen et 
al. 1998; Bowen and Ensign 1998). These insecticidal toxins are produced from a gene 
region in Photorhabdus luminescens which resembles the Serratia virulence region in the 
clustering of the genes and at the protein level, but has very little DNA homology with the 
Serratia genes. They have shown that high molecular weight proteins from Photorhabdus 
luminescens are insecticidal to a number of insects from different orders. The lack of 
DNA homology over the majority of the region, as opposed to protein homology, between 
the Serratia genes and Photorhabdus genes suggests that these proteins have evolved as 
a result of convergent evolution leading to the formation of a distinct protein family with 
a common function. 

The present applicant has now found that three regions of the pADAP plasmid are 
required for full insecticidal function. Sequence analysis of these three regions has shown 
that the present applicants have isolated and identified a novel toxin from Serratia sp 
which belongs to a new family of insecticidal toxins. It is broadly to this toxin that the 
present invention is directed.

SUMMARY OF THE INVENTION 


According to a first aspect of the present invention, there is provided an isolated nucleic
acid molecule comprising a nucleotide sequence of SEQ ID NO: 1 which encodes an
insecticidal protein complex, or a functional fragment, neutral mutation, or homolog
thereof capable of hybridising with said nucleic acid molecule under standard
hybridisation conditions.

The invention also provides an isolated nucleic acid molecule comprising the nucleotide 
sequence 1955-18937 of SEQ ID NO: 1 which encodes an insecticidal protein complex, 
or a functional fragment, neutral mutation, or homolog thereof capable of hybridising 
with said nucleic acid molecule under standard hybridisation conditions. 

The invention also provides an isolated nucleic acid molecule comprising one or more of
the nucleotide sequences 2411-9547, 9598-13884 and 14546-17467 of SEQ ID NO: 1
which encode insecticidal proteins, or a functional fragment, neutral mutation, or
homolog thereof capable of hybridising with said nucleic acid molecule under standard
hybridisation conditions.

Preferably the nucleic acid molecule comprises all of nucleotide sequences 2411-9547,
9598-13884 and 14546-17467 of SEQ ID NO: 1.
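
The coordinate ranges recited above can be handled with simple interval arithmetic. The following is a minimal sketch only: the function names and the region labels are hypothetical, and it assumes that the coordinates are 1-based and inclusive, which the specification does not state explicitly.

```python
# Coordinate ranges quoted in the text, assumed to be 1-based and inclusive.
# The labels "region_1" etc. are hypothetical and used only for illustration.
REGIONS = {
    "region_1": (2411, 9547),
    "region_2": (9598, 13884),
    "region_3": (14546, 17467),
}


def region_length(start: int, end: int) -> int:
    """Return the number of nucleotides in a 1-based, inclusive interval."""
    if start < 1 or end < start:
        raise ValueError("expected 1 <= start <= end")
    return end - start + 1


def extract_region(sequence: str, start: int, end: int) -> str:
    """Return the subsequence covered by a 1-based, inclusive interval."""
    region_length(start, end)  # validates the interval
    if end > len(sequence):
        raise ValueError("interval runs past the end of the sequence")
    # Python slices are 0-based and end-exclusive, hence start - 1.
    return sequence[start - 1:end]


if __name__ == "__main__":
    for name, (start, end) in REGIONS.items():
        print(name, region_length(start, end))
```

Under this reading the three ranges span 7137, 4287 and 2922 nucleotides respectively, each a multiple of three, which is at least consistent with their being complete protein-coding regions.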

The invention further relates to an isolated nucleic acid molecule comprising a sequence 
of SEQ ID NO: 1, nucleotides 1955-18937 of SEQ ID NO: 1 or one or more of 
nucleotides 2411-9547, 9598-13884 or 14546-17467 of SEQ ID NO: 1, operably linked 
to at least one further nucleotide sequence which encodes an insecticidal protein. For
example, the at least one further nucleotide sequence may be a nucleotide sequence
which codes for the Bacillus delta-endotoxins, vegetative insecticidal proteins (vips),
cholesterol oxidases, Clostridium bifermentans mosquitocidal toxins and/or
Photorhabdus luminescens toxins etc.

The nucleic acid molecule may comprise DNA, cDNA or RNA. 

Preferably said fragment, neutral mutation or homolog thereof is capable of hybridising 
to said nucleic acid molecule under stringent hybridisation conditions. 

The invention further relates to nucleic acid molecules which hybridise to the nucleotide
sequence of SEQ ID NO: 1, or nucleotides 1955-18937, 2411-9547, 9598-13884 or
14546-17467 of SEQ ID NO: 1 if there is at least 50%, preferably 60%, more preferably
70% and most preferably 90-95% or greater identity between the sequences.

The nucleic acid molecule may be isolated from Serratia entomophila or Serratia 
proteamaculans strains. 



Also provided by the present invention are recombinant expression vectors containing the 
nucleic acid molecule of the invention and hosts transformed with the vector of the 
invention capable of expressing a polypeptide of the invention. 

The vector may be selected from any suitable natural or artificial plasmid/vector, for
example pUC19 (Yannish-Perron et al. 1995), pProEX HT (GibcoBRL, Gaithersburg,
MD, USA), pBR322 (Bolivar et al. 1977), pACYC184 (Chang et al. 1978), pLAFR3
(Staskowicz et al. 1987), etc.

In a further aspect, the invention provides a method of producing a polypeptide of the
invention comprising the steps of: 

(a) culturing a host cell which has been transformed or transfected with a vector as 
defined above to express the encoded polypeptide or peptide; and 

(b) recovering the expressed polypeptide or peptide. 

An additional aspect of the present invention provides a ligand that binds to a polypeptide 
of the invention. Most usually, the ligand is an antibody or antibody binding fragment. 
Such ligands also form a part of this invention. 

According to a further aspect of the present invention there are provided probes and 
primers comprising a fragment of the nucleic acid molecule of the invention capable of 
hybridising under stringent conditions to a native insecticidal gene sequence. Such 
probes and primers are useful, for example, in studying the structure and function of this 
novel gene and for obtaining homologs of the gene from bacteria other than Serratia sp. 

According to a still further aspect of the present invention there is provided a polypeptide 
having insecticidal activity encoded by the nucleic acid molecule of the invention, or a 
functional fragment, neutral mutation or homolog thereof. 

The polypeptide may comprise the amino acid sequence of SEQ ID NO: 1 or a functional
fragment, neutral mutation or homolog thereof.

The polypeptide may comprise amino acids 32-5118 of SEQ ID NO: 1.

The polypeptide may comprise at least one amino acid sequence of SEQ ID NO: 2;
SEQ ID NO: 3; SEQ ID NO: 4; SEQ ID NO: 5 or SEQ ID NO: 6.

Preferably the polypeptide comprises amino acid sequence SEQ ID NO: 4; SEQ ID NO: 5 
and SEQ ID NO: 6. 

More preferably the polypeptide comprises all of SEQ ID NOs: 2-6. 

Conveniently, the polypeptide of the invention is obtained by expression of a DNA
sequence coding therefor in a host cell or organism.




The polypeptide may comprise the amino acid sequence of SEQ ID NO: 1 linked to at
least one further amino acid sequence of an insecticidal protein. For example, the
at least one further amino acid sequence may be the amino acid sequence of
Bacillus delta-endotoxins, vegetative insecticidal proteins (vips), cholesterol oxidases,
Clostridium bifermentans mosquitocidal toxins and/or Photorhabdus luminescens toxins
etc.

The invention further relates to polypeptides comprising at least 50%, preferably 60%,
more preferably 70% and most preferably 90-95% or greater identity to SEQ ID NO: 1.

The polypeptide may be produced by expression of a vector comprising the nucleic acid 
molecule of the invention or a functional fragment, neutral mutation or homolog thereof, 
in a suitable host cell. 

According to a further aspect, there is provided an insecticidal composition comprising
at least the polypeptide of the invention and an agriculturally acceptable carrier such as
would be known to a person skilled in the art. More than one polypeptide of the invention
can, of course, be included in the composition. In addition, the composition can comprise
one or more additional pesticides, for example, compounds known to possess herbicidal,
fungicidal, insecticidal, acaricidal or nematicidal activity.

The composition may further comprise other known insecticidally active agents, such as
Bacillus delta-endotoxins, vegetative insecticidal proteins (vips), cholesterol oxidases,
Clostridium bifermentans mosquitocidal toxins and/or Photorhabdus luminescens toxins
etc.

According to a further aspect, there is provided a method of combatting pests, especially
insects, at a locus or host for the pest infested with or liable to be infested therewith, said
method comprising applying to the locus, host and/or the pest an effective amount of the
polypeptide of the invention that has functional insecticidal activity against said pest.

According to a further aspect the invention provides a method of inducing amber disease 
or like condition in insects comprising delivery to an insect an effective amount of the 
polypeptide of the invention that has functional insecticidal activity against said insect. 

The insect may be selected from the orders Coleoptera (such as the black
beetle, Heteronychus arator (F.), or the black vine weevil, Otiorhynchus sulcatus (F.));
Dictyoptera (eg. [...] or the subterranean
termite Coptotermes spp.); Diptera (eg. the housefly Musca domestica L. or the blowfly
Lucilia cuprina (Wiedemann)); Orthoptera (eg. the black field cricket Teleogryllus
commodus (Walker) or the migratory locust Locusta migratoria L.); Hymenoptera (eg.
the German wasp, Vespula germanica (F.)); Hemiptera (such as the green vegetable bug
Nezara viridula (L.) or the green peach aphid Myzus persicae (Sulzer)); and Lepidoptera
(eg. the tomato fruitworm, Helicoverpa armigera (Walker), or the codling moth,
Laspeyresia pomonella (L.)).

The insecticidal polypeptide may be delivered to the insect orally as a solid bait
matrix, as a sprayable insecticide sprayed onto a substrate upon which the insect feeds,
by direct application to the soil subsurface, as a drench, by expression in a transgenic plant,
bacterium, virus or fungus upon which the insect feeds, or by any other suitable method
which would be obvious to a person skilled in the art.

According to a further aspect, the invention provides a transgenic plant, bacterium, virus 
or fungus, incorporating in its genome, a nucleic acid molecule of the invention providing 
the plant, bacterium, virus or fungus with an ability to express an effective amount of an 
insecticidal polypeptide. 

The invention will be further defined by reference to the specification and the following 
examples and figures herein. 

Figure 1 shows restriction maps of clones used to isolate the pathogenic region and maps
of the two pathogenic variants pMH32 and pMH41. (A) The pADAP HindIII clone
pGLA-20 showing locations of the pGLA-20 mutations -10, -13 and -35, which, when
recombined back into pADAP and bioassayed against grass grub, result in either a
pathogenic phenotype, shown by a full flag, or a healthy but non-feeding phenotype,
indicated by a half filled flag. Map of pBG35 showing the relative position of the pGLA-20-35
mutation and the location of the 2.2kb EcoRI [...]
library. (B) Restriction enzyme maps of the pathogenic clones pMH32 and pMH41; the area
of deletion is indicated by [symbol]. [symbol] pBR322 vector DNA; [symbol] pLAFR3 vector
DNA. Restriction enzymes are abbreviated as follows: B, BamHI; Bg, BglII; E, EcoRI;
H, HindIII; and X, XbaI.

Figure 2 shows (A) Mini-Tn10 pACYC184 based deletion derivatives used in the study.
[symbol] pACYC184 vector; [symbol] indicates deletion; + pathogenic; - loss of pathogenicity.
(B) Restriction maps of the mutated constructs pBM32 and the pADK recombinants (C).
The phenotype of each mutant is indicated by [...]
did not affect the disease process. Open flags indicate mutations that abolish disease
symptoms; half filled flags denote mutations that abolish visual disease symptoms but leave
larvae unable to feed. * indicates pADK mutations obtained by Grkovic et al. (1995). Restriction
enzymes are abbreviated as follows: B, BamHI; Bg, BglII; E, EcoRI; H, HindIII; and
X, XbaI. (D) Schematic diagram of the sequenced region. [symbol] denotes sequenced
region. Arrows indicate ORFs and their direction; [symbol] region homologous to spvB;
[symbol] location of repeat. (E) Nucleotide sequence of the 5 times 12bp repeat and the
palindrome.

Figure 3 shows hydrophobicity plots of SepC and its closest homologue TccC. The scale 
is disproportional to size and has a scanning window of 17 amino-acid residues. 

Figure 4 shows the comparison of protein sequences of SepA and the P. luminescens
toxins TcdA, TcaB and TccB. The putative RGD motif is boxed. The site of proteolytic
cleavage as reported by Bowen et al. (1998) (residue 1933 of TcdA) is indicated by an
arrow.

Figure 5 shows the comparison of protein sequences of the SepC and P. luminescens 
toxin TccC. 

Figure 6 shows the plasmid pADAP. 

DETAILED DESCRIPTION OF THE INVENTION 

1. DEFINITIONS AND METHODS 

The following definitions and methods are provided to better define the present invention
and to guide those of ordinary skill in the art in the practice of the present invention.

Definitions of common terms in molecular biology may also be found in Lewin, Genes V, 
Oxford University Press: New York, 1994. 

The term "native" refers to a naturally-occurring nucleic acid or polypeptide, including
wild-type sequence and alleles thereof.

A "homolog" has at least one of the biological activities of the nucleic acid or polypeptide
of the invention and [...]
sequence thereto, preferably 75%-85% and most preferably 90-95% identical amino acid
or nucleic acid sequence thereto.

The term "neutral mutation" means a mutation, ie a change in the nucleotide or
polypeptide sequence such as by deletion, substitution, inversion or insertion, which has
no effect on the function of the encoded protein.

As indicated above, also possible are variants of the polypeptide or peptide which differ 
from the native amino acid sequence by insertion, substitution or deletion of one or more 
amino acids. Where such a variant is desired, the nucleotide sequence of the native DNA 
is altered appropriately. This alteration can be made through elective synthesis of the 
DNA or by modification of the native DNA by, for example, site-specific or cassette 
mutagenesis. Preferably, where portions of cDNA or genomic DNA require sequence 
modifications, site-specific primer directed mutagenesis is employed using techniques 
standard in the art. 

In a further aspect, the present invention consists in a replicable transfer vector suitable for
use in preparing a polypeptide of the invention. These vectors may be constructed 
according to techniques well known in the art, or may be selected from cloning vectors 
available in the art. 

The cloning vector may be selected according to the host or host cell to be used. Useful
vectors will generally have the following characteristics:

(a) the ability to self-replicate;

(b) the possession of a single target for any particular restriction endonuclease; and

(c) desirably, carry genes for a readily selectable marker such as antibiotic resistance.

Two major types of vector with these characteristics are plasmids and bacterial
viruses (bacteriophages or phages). Presently preferred vectors include plasmids
pMOS-Blue, pGem-T and pUC8.

The nucleic acids of the present invention can be free in solution, or attached by
conventional means to a solid support, or present in an expression vector or any other type
of plasmid.

The term "isolated" means substantially separated or purified away from contaminating
sequences in the cell or organism in which the nucleic acid naturally occurs and includes
nucleic acids purified by standard purification techniques as well as nucleic acids prepared
by recombinant technology and those chemically synthesised.

The term "DNA construct" means a construct incorporating the nucleic acid molecule of
the present invention, or a functional fragment, neutral mutation or homolog thereof in a
position whereby the protein coding sequence is under the control of an operably linked
promoter capable of expression in a plant cell. Such promoters are well known in the art.

A fragment of a nucleic acid molecule according to the present invention is a portion of 
the nucleic acid that is less than full length and comprises at least a minimum length 
capable of hybridising specifically with a nucleic acid molecule according to the present 
invention (or a sequence complementary thereto) under stringent conditions as defined 
below. A fragment according to the present invention has at least one of the biological 
activities of the nucleic acid or polypeptide of the present invention. 

Nucleic acid probes and primers can be prepared based on nucleic acids according to the
present invention, eg the sequence of SEQ ID NO: 1. A "probe" comprises an isolated
nucleic acid attached to a detectable label or reporter molecule well known in the art.
Typical labels include radioactive isotopes, ligands, chemiluminescent agents, and
enzymes.

"Primers" are short nucleic acids, preferably DNA oligonucleotides 15 nucleotides or
more in length, which are annealed to a complementary target DNA strand by nucleic acid
hybridization to form a hybrid between the primer and the target DNA strand, then
extended along the target DNA strand by a polymerase, preferably a DNA polymerase.
Primer pairs can be used for amplification of a nucleic acid sequence, eg by the
polymerase chain reaction (PCR) or other nucleic acid amplification methods well known
in the art. PCR primer pairs can be derived from the sequence of a nucleic acid according
to the present invention, for example, by using computer programs intended for that
purpose such as Primer (Version 0.5, 1991, Whitehead Institute for Biomedical Research,
Cambridge, MA).
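
The way a primer anneals to a complementary target strand can be illustrated with a short sketch. This is not the Primer program cited above; it is a hypothetical illustration that assumes exact matching only (no mismatches, no thermodynamic scoring), and the function names are invented for the example. The 15-nucleotide minimum is taken from the definition of "primers" in the text.

```python
# Translation table for complementing DNA bases (upper and lower case).
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")

# From the text: primers are preferably 15 nucleotides or more in length.
MIN_PRIMER_LENGTH = 15


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def _all_positions(haystack: str, needle: str) -> list:
    """Return every 0-based start position of needle in haystack."""
    positions = []
    i = haystack.find(needle)
    while i != -1:
        positions.append(i)
        i = haystack.find(needle, i + 1)
    return positions


def find_binding_sites(template: str, primer: str) -> dict:
    """Locate exact primer binding sites on a double-stranded template.

    The template is given as its top strand, 5' to 3'. Positions are 0-based
    starts on that strand. "forward" lists sites where the primer is identical
    to the top strand (so it anneals to the bottom strand); "reverse" lists
    sites where the primer is complementary to the top strand.
    """
    if len(primer) < MIN_PRIMER_LENGTH:
        raise ValueError("primer shorter than 15 nucleotides")
    template = template.upper()
    primer = primer.upper()
    return {
        "forward": _all_positions(template, primer),
        "reverse": _all_positions(template, reverse_complement(primer)),
    }
```

A primer pair suitable for amplification would, in this simplified picture, give one "forward" site for one primer and one "reverse" site downstream of it for the other.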

Methods for preparing and using probes and primers are described, for example, in
Sambrook et al. Molecular Cloning: A Laboratory Manual, 2nd ed, vol. 1-3, ed Sambrook
et al. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY, 1989.

Probes or primers can be free in solution or covalently or noncovalently attached to a solid
support by standard means.



The term "operably linked" means a first nucleic acid sequence linked to a second nucleic 
acid sequence when the first nucleic acid sequence is placed in a functional relationship 
with the second nucleic acid sequence. For instance, a promoter is operably linked to a 
coding sequence if the promoter affects the transcription or expression of the coding 
sequence. Generally, operably linked DNA sequences are contiguous and, where 
necessary to join two protein coding regions, in reading frame. 

The DNA molecules of the invention may be expressed by placing them in operable 
linkage with suitable control sequences in a replicable expression vector. Control 
sequences may include origins of replication, a promoter, enhancer and transcriptional 
terminator sequences amongst others. The selection of the control sequence to be 
included in the expression vector is dependent on the type of host or host cell intended to 
be used for expressing the DNA. 

A "recombinant" nucleic acid is one that has a sequence that is not naturally occurring or 
has a sequence that is made by an artificial combination of two otherwise separated 
segments of sequence. This artificial combination is often accomplished by chemical 
synthesis or, more commonly, by the artificial manipulation of isolated segments of 
nucleic acids, eg, by genetic engineering techniques. 

Techniques for nucleic acid manipulation are described generally in, for example, 
Sambrook et al. (1989).

Large amounts of a nucleic acid according to the present invention can be produced by
recombinant means well known in the art or by chemical synthesis.

Natural or synthetic nucleic acids according to the present invention can be incorporated
into recombinant nucleic acid constructs, typically DNA constructs, capable of
introduction into and replication in a host cell. Usually the DNA constructs will be
suitable for replication in a unicellular host, such as E. coli or other commonly used
bacteria, but can also be introduced into yeast, mammalian, plant or other eukaryotic cells.

Preferably, such a nucleic acid construct is a vector comprising a replication system 
recognized by the host. For the practice of the present invention, well known 
compositions and techniques for preparing and using vectors, host cells, introduction of 
vectors into host cells, etc, are employed, as discussed, inter alia, in Sambrook et al. 
(1989).

A cell, tissue, organ, or organism into which has been introduced a foreign nucleic acid,
such as a recombinant vector, is considered "transformed" or "transgenic". The DNA 
construct comprising a DNA sequence according to the present invention that is present 
in a transgenic host cell, particularly a transgenic plant, is referred to as a "transgene." 
The term "transgenic" or "transformed" when referring to a cell or organism, also includes 
(1) progeny of the cell or organism and (2) plants produced from a breeding program 
employing such a "transgenic" plant as a parent in a cross and exhibiting an altered 
phenotype resulting from the presence of the recombinant DNA construct. 

Generally, procaryotic, yeast, insect or mammalian cells are useful hosts. Also included
within the term hosts are plasmid vectors. Suitable procaryotic hosts include E. coli,
Bacillus species and various species of Pseudomonas. Commonly used promoters such
as the β-lactamase (penicillinase) and lactose (lac) promoter systems are all well known in the
art. Any available promoter system compatible with the host of choice can be used.
Vectors used in yeast are also available and well known. A suitable example is the 2
micron origin of replication plasmid.

Similarly, vectors for use in mammalian cells are also well known. Such vectors include 
well known derivatives of SV-40, adenovirus, retrovirus-derived DNA sequences, Herpes 
simplex viruses, and vectors derived from a combination of plasmid and phage DNA. 

Further eucaryotic expression vectors are known in the art (e.g. P.J. Southern and P. Berg,
J. Mol. Appl. Genet. 1, 327-341 (1982); S. Subramani et al., Mol. Cell. Biol. 1, 854-864
(1981); R.J. Kaufmann and P.A. Sharp, "Amplification and Expression of
Sequences Cotransfected with a Modular Dihydrofolate Reductase Complementary DNA
Gene," J. Mol. Biol. 159, 601-621 (1982); R.J. Kaufmann and P.A. Sharp, Mol. Cell.
Biol. 159, 601-664 (1982); S.I. Scahill et al., "Expressions And Characterization Of The
Product Of A Human Immune Interferon DNA Gene In Chinese Hamster Ovary Cells,"
Proc. Natl. Acad. Sci. USA 80, 4654-4659 (1983); G. Urlaub and L.A. Chasin, Proc.
Natl. Acad. Sci. USA 77, 4216-4220 (1980)).

The expression vectors useful in the present invention contain at least one expression 
control sequence that is operatively linked to the DNA sequence or fragment to be 
expressed. The control sequence is inserted in the vector in order to control and to 
regulate the expression of the cloned DNA sequence. Examples of useful expression 
control sequences are the lac system, the trp system, the tac system, the trc system, major
operator and promoter regions of phage lambda, the glycolytic promoters of yeast acid
phosphatase, e.g. Pho5, the promoters of the yeast alpha-mating factors, and promoters
derived from polyoma, adenovirus, retrovirus, and simian virus, e.g. the early and late
promoters of SV40, and other sequences known to control the expression of genes of 
prokaryotic and eucaryotic cells and their viruses or combinations thereof. 

In the construction of a vector it is also an advantage to be able to distinguish the vector 
incorporating the foreign DNA from unmodified vectors by a convenient and rapid assay. 
Reporter systems useful in such assays include reporter genes, and other detectable labels 
which produce measurable colour changes, antibiotic resistance and the like. In one 
preferred vector, the β-galactosidase reporter gene is used, which gene is detectable by
clones exhibiting a blue phenotype on X-gal plates. This facilitates selection. In one
embodiment, the β-galactosidase gene may be replaced by a polyhedrin-encoding gene,
which gene is detectable by clones exhibiting a white phenotype when stained with X-gal.

This blue-white colour selection can serve as a useful marker for detecting recombinant
vectors. 

Once selected, the vectors may be isolated from the culture using routine procedures such 
as freeze-thaw extraction followed by purification. 

For expression, vectors containing the DNA of the invention to be expressed and control 
signals are inserted or transformed into a host or host cell. Some useful expression host 
cells include well-known prokaryotic and eucaryotic cells. Some suitable prokaryotic 
hosts include, for example, E. coli, such as E. coli SG-936, E. coli HB101, E. coli W3110,
E. coli X1776, E. coli X2282, E. coli DHT and E. coli MR01, Pseudomonas, Bacillus, such
as Bacillus subtilis, and Streptomyces. Suitable eucaryotic cells include yeast and other
fungi, insect and animal cells, such as COS cells and CHO cells, human cells and plant cells
in tissue culture.

Depending on the host used, transformation is performed according to standard techniques
appropriate to such cells. For prokaryotes or other cells that contain substantial cell
walls, the calcium treatment process (Cohen, S N Proceedings, National Academy of
Science, USA 69: 2110 (1972)) may be employed. For mammalian cells without such cell
walls the calcium phosphate precipitation method of Graham and Van Der Eb, Virology
52:546 (1978) is preferred. Transformations into plants may be carried out using
Agrobacterium tumefaciens (Shaw et al., Gene 23:315 (1983)) or into yeast according to
the method of Van Solingen et al. J.Bact 130: 946 (1977) and Hsiao et al. Proceedings,
National Academy of Science, 76: 3829 (1979).

Upon transformation of the selected host with an appropriate vector the polypeptide or
peptide encoded can be produced, often in the form of fusion protein, by culturing the host 
cells. The polypeptide or peptide of the invention may be detected by rapid assays as 
indicated above. The polypeptide or peptide is then recovered and purified as necessary. 
Recovery and purification can be achieved using any of those procedures known in the art,
for example by adsorption onto, and elution from, an anion exchange resin. This method
of producing a polypeptide or peptide of the invention constitutes a further aspect of the 
present invention. 

Host cells transformed with the vectors of the invention also form a further aspect of the 
present invention. 

Methods for chemical synthesis of nucleic acids are well known and can be performed, 
for example, on commercial automated oligonucleotide synthesizers. 

The term "stringent conditions" is functionally defined with regard to the hybridization 
of a nucleic acid probe to a target nucleic acid (ie to a particular nucleic acid sequence of 
interest) by the hybridization procedure discussed in Sambrook et al. (1989) at 9.52-9.55 
and 9.56-9.58. 

Regarding the amplification of a target nucleic acid sequence (eg by PCR) using a 
particular amplification primer pair, stringent conditions are conditions that permit the 
primer pair to hybridize only to the target nucleic acid sequence to which a primer having 
the corresponding wild type sequence (or its complement) would bind. 

Nucleic acid hybridization is affected by such conditions as salt concentration, 
temperature, or organic solvents, in addition to the base composition, length of the 
complementary strands, and the number of nucleotide base mismatches between the 
hybridizing nucleic acids, as will be readily appreciated by those skilled in the art.
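
The dependence of hybrid stability on salt concentration, base composition and length can be illustrated with two widely quoted rule-of-thumb melting temperature (Tm) estimates. Neither formula is taken from this specification and both are approximations: the first is the simple 2 + 4 rule for short oligonucleotides, and the second is one commonly cited salt-adjusted form, Tm = 81.5 + 16.6 log10[Na+] + 0.41(%GC) - 675/N. The function names are hypothetical.

```python
import math


def gc_fraction(seq: str) -> float:
    """Return the fraction of G and C bases in a DNA sequence."""
    if not seq:
        raise ValueError("empty sequence")
    s = seq.upper()
    return (s.count("G") + s.count("C")) / len(s)


def wallace_tm(oligo: str) -> float:
    """Rule-of-thumb Tm in degrees C for short oligonucleotides.

    Adds 2 degrees per A or T and 4 degrees per G or C.
    """
    s = oligo.upper()
    at = s.count("A") + s.count("T")
    gc = s.count("G") + s.count("C")
    return 2.0 * at + 4.0 * gc


def salt_adjusted_tm(seq: str, sodium_molar: float) -> float:
    """Approximate Tm in degrees C for longer duplexes.

    Uses 81.5 + 16.6*log10([Na+]) + 0.41*(%GC) - 675/N, where N is the
    duplex length and [Na+] is the molar sodium concentration.
    """
    if sodium_molar <= 0:
        raise ValueError("sodium concentration must be positive")
    n = len(seq)
    percent_gc = 100.0 * gc_fraction(seq)
    return 81.5 + 16.6 * math.log10(sodium_molar) + 0.41 * percent_gc - 675.0 / n
```

In this approximation, lowering the salt concentration, shortening the duplex or reducing the GC content each lowers the estimated Tm, which is why low-salt, high-temperature washes are described as more stringent.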

When referring to a probe or primer, the term "specific for (a target sequence)" indicates 
that the probe or primer hybridizes under stringent conditions only to the target sequence 
in a given sample comprising the target sequence. 

The term "protein (or polypeptide)" refers to a protein encoded by the nucleic acid
molecule of the invention including fragments, mutations and homologs having the same
biological activity ie insecticidal activity. The polypeptide of the invention can be isolated
from a natural source, produced by the expression of a recombinant nucleic acid molecule
or be chemically synthesized. 

Peptides having substantial sequence identity to the above-mentioned peptides can also 
be employed in preferred embodiments. Here, "substantial sequence identity" means that 
two peptide sequences, when optimally aligned, such as by the programs GAP or 
BESTFIT using default gap weights, share at least 80 percent sequence identity, 
preferably at least 90 percent sequence identity, more preferably at least 95 percent 
sequence identity or more. Preferably, residue positions which are not identical differ by 
conservative amino acid substitutions. For example, the substitution of amino acids
having similar chemical properties such as charge or polarity is not likely to affect the
properties of a protein. Examples include glutamine for asparagine or glutamic acid for
aspartic acid.
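
The identity thresholds above can be illustrated with a minimal sketch. It assumes the two peptides have already been optimally aligned (for example by GAP or BESTFIT) and are supplied as equal-length strings with "-" marking gaps. The function names, the tier labels, and the choice to exclude gapped columns from the denominator are illustrative assumptions, not the scoring used by those programs.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over aligned columns in which neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap:
            continue  # assumption: gapped columns are not counted
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * identical / compared


def identity_tier(pct: float) -> str:
    """Map a percent identity onto the 80/90/95 percent thresholds in the text."""
    if pct >= 95.0:
        return "more preferred"
    if pct >= 90.0:
        return "preferred"
    if pct >= 80.0:
        return "substantial"
    return "below threshold"


if __name__ == "__main__":
    pct = percent_identity("MKISSRGIAL", "MKISSRGLAL")
    print(pct, identity_tier(pct))
```
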

PROTOCOL 

Bacterial isolates and methods of culture 

Table 1 lists bacterial isolates and plasmids used in the present invention. Bacteria were
grown in LB broth or on LB agar (Sambrook et al. 1989), at 37°C for Escherichia coli and
30°C for S. entomophila. Antibiotic concentrations used (µg/ml) for Serratia were
kanamycin 100, chloramphenicol 90, tetracycline 30 and for E. coli strains were
kanamycin 50, chloramphenicol 30, tetracycline 15, and ampicillin 100.

DNA isolation and manipulations 

pADAP DNA was isolated from a 50 ml overnight culture of bacteria using a QIAGEN®
plasmid maxi kit (Qiagen, Hilden, Germany), as per the manufacturer's instructions.
Standard DNA techniques were carried out as described by Sambrook et al. (1989).
Radioactive probes were made using the Amersham Megaprime DNA labelling system
(Amersham, Buckinghamshire, UK). Southern and colony hybridisations were performed
as outlined in Sambrook et al. (1989). The plasmid pADAP is shown in Figure 6.

A pADAP BamHI library was constructed using a Sigma Gigapack® III XL packaging
extract, as specified by the manufacturer (Stratagene, California, USA).



Introduction of plasmid DNA into E. coli and S. entomophila

pLAFR3 based derivatives were introduced into S. entomophila by tripartite matings on
solid media as described previously (Finnegan & Sherratt, 1982) using the pRK2013 helper
plasmid. Plasmids were
electroporated into E. coli and S. entomophila strains, using a Bio-Rad Gene Pulser (25 µF,
2.5 kV, and 200 ohms) (Dower et al. 1988).

Mutagenesis 

Transposon insertions were generated in recombinant plasmids using the mini-Tn10
derivative 103 (kanamycin resistant) as described by Kleckner et al. (1991). Insertions
were recombined into pADAP by transforming A1M02 (refer to Table 1) with the desired 
construct. After growth in non-selective media, bacteria were screened for resistance to 
kanamycin and loss of the pLAFR3 tetracycline resistance marker. 

Bioassay against Costelytra zealandica larvae 

Infection of C. zealandica larvae was determined by a standard bioassay in which healthy
larvae, collected from the field, were individually fed squares of carrot which had been
rolled in colonies of bacteria grown overnight on solid media (resulting in approximately
10^5 cells/carrot square). Twelve second or third instar larvae were used for each
treatment. Inoculated larvae were maintained at 15°C, in ice-cube trays. Larvae were left
feeding on treated carrot for 3-4 days, then transferred to fresh trays and provided with
untreated carrot for 10-14 days. The occurrence of gut clearance and loss of feeding was
recorded every 3-4 days. Strains were considered disease-causing if greater than 70% of
larvae showed disease symptoms by day 14. Known pathogenic and non-pathogenic
controls were included in all bioassays. Typically, cessation of feeding occurs within 2-3
days while clearance of the larval gut may take 4-6 days.
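
The scoring rule used in the bioassay (more than 70% of larvae symptomatic by day 14, twelve larvae per treatment) can be written as a short sketch. The function name and argument names are hypothetical and serve only to make the arithmetic explicit.

```python
def is_disease_causing(symptomatic: int, total: int = 12, threshold: float = 0.70) -> bool:
    """Return True if the fraction of symptomatic larvae at day 14 exceeds the threshold.

    symptomatic: larvae showing disease symptoms by day 14
    total: larvae in the treatment (twelve in the protocol above)
    threshold: fraction that must be exceeded (70% in the protocol above)
    """
    if total <= 0 or not 0 <= symptomatic <= total:
        raise ValueError("symptomatic must lie between 0 and total, and total must be positive")
    return symptomatic / total > threshold


if __name__ == "__main__":
    for n in (8, 9):
        print(n, "of 12 symptomatic:", is_disease_causing(n))
```

With twelve larvae per treatment, the rule means that at least nine larvae must show symptoms (9/12 = 75%), since eight (about 67%) falls below the 70% cut-off.
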

Recovery of bacteria from larvae 

To isolate bacteria from inoculated grubs, larvae were surface sterilised by submerging
in 70% methanol for 30 seconds. The larvae were then shaken in sterile dH2O, removed
and individually macerated in a 1.5 ml microcentrifuge tube. The macerate was serially
diluted and plated on LB media containing antibiotics selective for the host S.
entomophila strain. To assess the stability of the bioassayed plasmid, colonies were
patched onto a plate containing antibiotics either selective for the recombinant plasmid or
the S. entomophila strain. Identity of plasmids in the recovered strain was checked by
restriction enzyme profile.

Nucleotide Sequencing 

A 9-kb BamHI-EcoRI fragment derived from the pBM32-8 mutation (Fig 2b) and the 8-kb
HindIII fragment of pBM32 were separately cloned into the appropriate site of the
deletion factory plasmid pDELTA1. Deletions were generated using the Deletion
Factory™ system (GIBCO BRL, MD, USA), as outlined in the manufacturer's instructions.




To identify the precise location of mini-Tn10 mutations, the peripheral mini-Tn10 BamHI
sites were used in conjunction with the BamHI sites of the pathogenic region to subclone
the mini-Tn10 flanking regions into either pACYC184 or pUC19. Sequences were
generated using the mini-Tn10 specific primer 5'-ATGACAAGATGTGTATCCACC-3'
(Kleckner et al. 1991).

Plasmids for sequencing were prepared by Wizard® (Promega, Madison, USA) or 
Quantum Prep® (Bio-Rad, California, USA) miniprep kits. Sequences were determined 
on both strands, by using combinations of subcloned fragments, custom primers and 
deletion products derived from the deletion factory system (Gibco BRL, Madison, USA). 
The DNA was sequenced using either 33P dCTP and the Thermosequenase cycle
sequencing kit (Amersham, Buckinghamshire, UK), or by automated sequencing using
an Applied Biosystems 373A or 377 autosequencer. Sequence data were assembled using
SEQMAN (DNASTAR Inc, Madison, USA). ORFs were analysed by Gene Jockey.
Databases at the National Center for Biotechnology Information were searched by using
BLASTN and BLASTX via www.ncbi.nlm.gov/BLAST. Searches for DNA
palindromes, repeats and inverted repeats were undertaken using DNAMAN (Lynnon
Biosoft, Quebec, Canada). Protein motifs were searched using Blocks
(http://www.blocks.fhcrc.org/), ExPASy (http://www.expasy.ch/), and Gene Quiz
(http://columba.ebi.ac.uk:8765/gqsrv/submit).

The sequences determined in this study have been deposited in GenBank under
accession number AF135182.

RESULTS 

Cloning the disease encoding region from pADAP 

Previously, Grkovic et al. (1995) have shown that the pADK-13 mutation can be
complemented with the pADAP 11 kb HindIII fragment (pGLA-20). However, the
pADK-10 mutation was unable to be complemented with pGLA-20. In an attempt to
isolate the region that may complement the pADK-10 mutation, the previously described
pGLA-20 derived pADK-35 null mutation (Grkovic et al. 1995) was used as a selective
marker (Fig 1) to select the BglII fragment encompassing both the pADK-10 and
pADK-35 mutations. pADK-35 DNA was isolated and digested with the restriction
enzyme BglII. The resultant digest was ligated into the BamHI site of pBR322 to form
the construct pBG35 (containing the 12.8 kb BglII - mini-Tn10 fragment). pBG35 was placed
separately in trans with pADK-10 and pGLA-20, and the resultant strains bioassayed
against grass grub larvae. Results showed that pBG35 was able to complement the
pADK-10 mutant, but was unable to induce any symptoms of amber disease when placed
in trans with pGLA-20, indicating that there must be another region on pADAP needed
to induce amber disease.

Restriction enzyme data of pGLA-20 and pBG35 suggested that the entire pathogenic
region may reside within one of the large BamHI fragments of pADAP. A cosmid
BamHI library of pADAP was made and screened using the 2.2 kb EcoRI fragment derived
from pBG35 (Fig 1) as the probe. Several probe-positive clones were isolated; all shared
similar restriction enzyme profiles. However, one (designated pMH32) was found to be
smaller, measuring only 23 kb in size compared with the 33 kb of the other clones
(e.g. pMH41; Fig 1b). The difference between pMH32 and pMH41 was found to be a 10 kb
deletion at the left-most end of pMH32 encompassing the one HindIII site (Fig. 1).
E. coli strains containing pMH32 or pMH41 were bioassayed against grass grub larvae and
found to induce the full symptoms of amber disease (i.e. gut clearance and antifeeding
activity). However, about ten days after infection a proportion of grass grubs fed the
E. coli strains were found to recover from a diseased to a healthy phenotype.

The plasmids pMH32 and pMH41 were subsequently introduced into a S. entomophila
strain cured of pADAP (5.6RC) and the strains bioassayed against grass grub larvae. The
strains gave the same disease progression as wild type and no larvae recovered, suggesting
that the region cloned in pMH32 contained all the pathogenic determinants of pADAP.

Effect of copy number and mini-Tn10 insertions in pBM32 on disease-causing ability

To facilitate mutagenesis and assess the effect of copy number on the disease process, the
23 kb BamHI fragment from pMH32 was cloned into the medium copy plasmid pBR322
to give pBM32. A bioassay comparing the ability of pMH32 and pBM32 to induce amber
disease against grass grub was undertaken. Results showed that there were no visual
differences in the progression of amber disease between pBM32 and pMH32. The
construct pBM32 was mutated with the mini-Tn10 transposon derivative 103, and
insertions mapped (Fig 2b). Bioassays of E. coli strains containing plasmids of the
resultant mutants showed that the disease determinants were confined within a central
16.9 kb region (nucleotides 1955-18937 of SEQ ID NO: 1).

All strains were non-pathogenic or fully pathogenic, and no partial disease phenotypes 
such as antifeeding, or gut clearance were noted. 

To confirm that no sequences at either end of the cloned fragment influenced the disease
process, several deletion plasmids were made (Fig 2a). The large fragments resulting
from cleavage of the pBM32 -4, -8, -10, -20, -23, -24 and -35 plasmids with BamHI were
cloned into the analogous site of pACYC184. The resultant plasmids were transformed
into the non-pathogenic S. entomophila strain 5.6RK, and assessed for pathogenicity. This
analysis confirmed that the central 16.9 kb region (Fig 2a) was sufficient to induce the
disease.

Effect of mini-Tn10 insertions in pADAP on disease-causing ability

Grkovic et al. (1995) recombined by marker exchange the pGLA-20 based mutations -10
and -13 into pADAP (Fig 2a). When bioassayed, S. entomophila strains containing either
of these mutant plasmids caused a partial condition including cessation of feeding but not
gut clearance or amber colouration. This was in contrast to the complete abolition of
disease observed in pADAP-cured S. entomophila strains containing mutant pBM32
plasmids with similar insertions.

To determine the disease phenotype of the pBM32-based insertions in a pADAP
background, the pBM32-based insertions were transferred into pADAP. pBM32 -1, -2,
-4, -5, -6, -8, -9, -10, -21, -24, -30, -31 and -35 DNA fragments containing the inserted
transposon and flanking DNA were cloned as independent fragments into pLAFR3 and
the inserts recombined back into pADAP by marker exchange (Fig 2c). The resultant
recombinant S. entomophila strains were checked by Southern analysis to confirm that
recombination had occurred as expected and no pLAFR3 vector sequences were present
(data not shown). Mutations that did not affect the disease process in pBM32 also had no
effect when recombined back into pADAP. However, strains carrying the pADAP mutations
that totally abolished the disease process when in the pBM32 clone caused non-feeding
but not gut clearance of the grubs (Fig 2b, c). Hence, none of the pADAP recombinant
strains completely abolished the disease process. This suggests that, while the 16.9 kb
fragment contains all genes required for pathogenicity, other genes contributing to the
anti-feeding effect are present on some other part of pADAP.



Assessment of plasmid stability during the course of the bioassay showed that greater than
90% of the recombinant Serratia strains contained the clone of interest.

Nucleotide Sequence analysis of the pathogenic region 

The large BamHI fragment (18937 bp) derived from pBM32-8 was sequenced on both
strands using a combination of constructed deletions, plasmid subclones and custom
made primers. A total continuous sequence of 18937 bp has been deposited in GenBank
(Accession number AF135182). Structural analysis of the DNA sequence using
DNAMAN showed that there was a 12-bp sequence repeated five times between positions
683 and 743. The repeat is flanked by an upstream 13 base pair palindrome (669-682 bp),
and a degenerate 34-bp downstream palindrome (765-799 bp) (Fig. 2d,e).
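
The kind of structural search described above (repeated short sequences and palindromes) can be sketched in a few lines. This is not the DNAMAN procedure; it is a minimal illustration with hypothetical function names, run on a made-up toy sequence rather than on SEQ ID NO: 1. It tests only perfect reverse-complement palindromes, which necessarily have an even length, so odd-length or degenerate palindromes such as those reported above would need a more tolerant search.

```python
from collections import Counter

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an upper- or lower-case DNA string."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def is_palindrome(seq: str) -> bool:
    """True if the sequence equals its own reverse complement."""
    return seq.upper() == reverse_complement(seq)


def repeated_kmers(seq: str, k: int, min_count: int = 2) -> dict:
    """Count every k-mer (overlapping windows) and keep those seen at least min_count times."""
    s = seq.upper()
    counts = Counter(s[i:i + k] for i in range(len(s) - k + 1))
    return {kmer: n for kmer, n in counts.items() if n >= min_count}


if __name__ == "__main__":
    unit = "ACGTTGCAAGCT"            # hypothetical 12-bp repeat unit
    toy = "GG" + unit * 5 + "AA"     # toy sequence carrying five tandem copies
    print(repeated_kmers(toy, 12)[unit])
    print(is_palindrome("GAATTC"))
```
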

Translation of the nucleotide sequence revealed nine significant open reading frames 
(ORFs). These together with their putative ribosomal binding sites and their base 
composition are listed in Table 2. Eight of the ORFs were oriented in the same direction 
and the other two in the opposite direction (Fig 2d). Sequence similarity searches showed 
that the deduced products of seven of these ORFs shared similarity with known proteins 
(Table 3). Products of three of the ORFs showed similarity to different protein 
components of insecticidal toxins of Photorhabdus luminescens (Bowen et al. 1998). 

These ORFs have been designated sep (sepA, sepB and sepC) for Serratia entomophila
pathogenicity.

Similarities of deduced amino-acid sequences to proteins in current database 

Results of database searches for homologous proteins are listed in Table 4. 

With reference to Fig 2d and Table 4, the following protein similarities were identified:
The protein product of sepA had high similarity to the P. luminescens insecticidal toxin
complex proteins TcbA, TcdA, TcaB and TccB. These proteins shared three significant
regions of predicted amino-acid similarity, at the amino-terminal region (SepA amino-acid
residues 121-178), a central region (SepA amino-acid residues 960-1083) and, with
greatest similarity, at the carboxyl terminus (SepA amino-acid residues 1630-2376) (Fig.
4). However, there was little amino acid conservation around the putative proteolytic
cleavage site of TcaB, TcbA and TcdA identified by Bowen et al. (1998). SepA also
contained a region (residues 1057-1345) with weak similarity to the Clostridium
bifermentans mosquitocidal toxin Cbm71 (Barloy et al., 1996).

SepB and the P. luminescens insecticidal toxin complex protein TcaC shared similarity
throughout their length, and both SepB and TcaC showed high amino-terminal similarity
to the Salmonella virulence protein SpvB (Gulig et al. 1992) (Fig. 5). The similarity of
SepB and TcaC to SpvB diminishes after SpvB amino acid residue 356.

SepC showed strong similarity to the amino-terminal region of the insecticidal toxin complex
protein TccC, up to amino-acid residue 663 of SepC. A number of putative bacterial cell
wall proteins also have high similarity to SepC, including the wall associated protein
precursor of B. subtilis (WapA) and members of the E. coli Rhs (recombination hot spot)
elements. Strong similarity of SepC was also observed with hypothetical wall-associated
proteins from Coxiella burnetii and Bacillus subtilis (Table 4).

The translated sequences of ORF1 and ORF2 showed no similarity to sequences in the 
current databases. ORF3 shared significant similarity to the morphogenesis protein of the 
Bacillus subtilis bacteriophage B103, a member of bacteriophage muramidase-type lysis 
proteins (Pecenkova et al. 1996). However, relative to size, the gp19 protein of S.
typhimurium phage ES18 (146 amino-acid residues) or the nucD/regB phage lysozymes
of S. marcescens (179 amino-acid residues) are more similar. ORF4 showed similarity to
the E. coli bacteriophage N15 gp55 protein, a protein of unknown function (Zimmer et al.,
1998).

Located in the same orientation as the sep genes and 134 bp downstream of the sepC
termination codon is a 204 base pair region assigned ORF5, which has high similarity to
a S. typhimurium resolvase/invertase protein. However, ORF5 is disrupted by two stop
codons at amino-acid residues 19 and 64, making it unlikely that an active
resolvase/invertase protein is encoded by this region. A 256-bp region encompassed
by ORF5 (17498-17754) showed high similarity (77% identity) to the region (AF020806;
1629-1885 bp) encoding the S. typhimurium DNA invertase gene (Valdivia et al. 1997),
suggesting a similar ancestral origin.

Downstream of ORF5 and oriented in the opposite direction from 18935-18163 was a 870
basepair region of DNA designated ORF6 whose product showed high amino-acid
similarity over two different reading frames to the insertion element IS91 of E. coli
(Mendiola et al. 1992). The translated sequence is interrupted at amino-acid residue 149
of the IS91 element and later resumed on a second reading frame, before its similarity
switched back to the original reading frame. Switching of ORFs is a common feature of
members of the IS3 family, where the transposase is encoded by overlapping ORFs
(Prere et al. 1990). However, the switch back to [...] ORF6 may
therefore be a dysfunctional relic of an ancestral IS element. It is unknown whether ORF6
contains a ribosomal binding site as its predicted location would lie outside the sequenced
region. There was no DNA similarity to the IS91 element.

Analysis for protein motifs showed that a tripeptide cell-binding motif Arg-Gly-Asp
(RGD), implicated in the binding of various adhesion proteins produced by parasites and
viruses to eukaryotic cells (Leininger et al., 1991), is present in SepA and the P.
luminescens TcdA, TcbA, and TcaB proteins (Fig. 4). The RGD motif is present in cell
surface adhesins of Bordetella pertussis, including
filamentous haemagglutinin (220 kDa) (Relman et al., 1989) and the outer membrane
protein pertactin (69 kDa) (Leininger et al., 1991). These motifs have been implicated in
enhancing the binding of B. pertussis to eukaryotic cells. Because the RGD motif found
in SepA falls in a region of high similarity between SepA and its P. luminescens
counterparts, it may play a role in mediating the attachment of the protein and/or the
bacteria to the insect cell wall.
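
A motif search of this kind amounts to scanning a protein sequence for the tripeptide in one-letter code. The following minimal sketch uses a hypothetical function name and a made-up peptide, not the SepA sequence, and reports 1-based residue positions.

```python
def find_motif(protein: str, motif: str = "RGD") -> list:
    """Return the 1-based start positions of every occurrence of motif in protein."""
    seq = protein.upper()
    target = motif.upper()
    width = len(target)
    return [i + 1 for i in range(len(seq) - width + 1) if seq[i:i + width] == target]


if __name__ == "__main__":
    toy_peptide = "MKRGDLLARGDS"  # made-up peptide for illustration only
    print(find_motif(toy_peptide))
```
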

The hydropathicity profile of each of the Sep proteins was examined using the Kyte and 
Doolittle algorithm (Kyte and Doolittle, 1982) and compared to the relevant P. 
luminescens homologues. None of the Sep proteins contained a positively charged amino 
terminus followed by a hydrophobic region, characteristic of a signal sequence (Gierasch, 
1989). The profiles of SepA, TcbA and TcdA were very similar (data not shown) and 
each exhibited a steep hydrophilic peak at the carboxyl terminus (residues 2055-2061 of 
SepA), specifically the protein sequence RRRRE (Fig. 4). Although both SepB and TcaC 
shared similarity to the Salmonella virulence protein SpvB, the amino termini of SepB
and TcaC were hydrophilic as opposed to the hydrophobic nature of SpvB. The profile 
of SepC and its Photorhabdus counterpart TccC differed in that SepC had a slightly 
hydrophilic amino-terminus, whereas TccC lacked a hydrophilic amino-terminus and had 
a significantly hydrophobic carboxyl terminus from amino-acid residue 717 onwards (Fig. 



Analysis to detect repetitive motifs characteristic of the RTX family of toxins (Welch,
1991) using DOTPLOT showed only P. luminescens TccC contained a plot characteristic
of a repeat motif present at the carboxy terminal (data not shown).
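
The hydropathicity comparison described earlier in this section relies on the Kyte and Doolittle scale, in which each residue has a fixed hydropathy index and a profile is obtained by averaging over a sliding window. The sketch below illustrates that calculation; the function name and the window length of 5 are illustrative assumptions and are not taken from the analysis reported here.

```python
# Kyte and Doolittle (1982) hydropathy indices, one-letter amino-acid code.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(protein: str, window: int = 5) -> list:
    """Mean hydropathy index for every window of the given length along the protein."""
    seq = protein.upper()
    if window < 1 or window > len(seq):
        raise ValueError("window must be between 1 and the sequence length")
    scores = [KYTE_DOOLITTLE[aa] for aa in seq]
    return [sum(scores[i:i + window]) / window for i in range(len(seq) - window + 1)]


if __name__ == "__main__":
    # The strongly hydrophilic pentapeptide RRRRE mentioned in the text.
    print(hydropathy_profile("RRRRE", window=5))
```

Negative window means indicate hydrophilic stretches and positive means hydrophobic ones, so a run of arginines gives a sharply negative value.
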

Analysis of DNA composition (%GC) and similarity 

Comparisons of the GC content (Table 3) showed that the sepA and sepB genes were more
GC-rich than their P. luminescens counterparts, while sepC and tccC had similar GC
content. The high GC content of sepC and tccC may be attributed to the close relationship
of these protein products to the Rhs family of wall-associated proteins which have a
GC-rich core of 62% (Wang et al., 1998). Comparisons of the GC content of the sep
genes with that of the S. entomophila genome show that they are rather similar,
suggesting that the sep genes were not recently acquired by S. entomophila.
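
GC content, as compared above, is simply the percentage of G and C bases among all bases of a sequence. A minimal sketch follows, with a hypothetical function name; the example string is the first ten bases of SEQ ID NO: 1 and is used only to show the arithmetic.

```python
def gc_percent(seq: str) -> float:
    """Percentage of G and C among the A, C, G and T bases of a DNA sequence."""
    bases = [b for b in seq.upper() if b in "ACGT"]  # ignore spaces and other symbols
    if not bases:
        raise ValueError("sequence contains no A, C, G or T bases")
    gc = sum(1 for b in bases if b in "GC")
    return 100.0 * gc / len(bases)


if __name__ == "__main__":
    print(gc_percent("ggatccgagt"))
```
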

Identification of mini-Tn10 location by sequence analysis

Analysis of the insertion points of the previously isolated mini-Tn10 insertions (Fig. 2)
within the putative ORFs (Table 4) revealed that ORF3 and ORF4 were interrupted by the
-9, -23, -24 (ORF3) and -35 (ORF4) insertions. These insertions did not affect the
pathogenicity process, suggesting that ORF3 and ORF4 do not play a significant role in
pathogenicity. However, the pADAP-35 mutation was at the 3' end of ORF4, resulting in
a truncation of the final 11 amino-acid residues of ORF4 (Fig. 4), which may not have
affected protein function. Further mutagenesis of ORF4 is therefore required to confirm
that it has no role in pathogenicity. The mutations that caused loss of pathogenicity all
resided within sepA, sepB or sepC. No mutation mapped to ORF1, ORF2 or ORF5.

SUMMARY 

The bacteria Serratia entomophila and S. proteamaculans cause amber disease in the
grass grub, Costelytra zealandica (Coleoptera: Scarabaeidae), an important pasture pest 
in New Zealand. Larval disease symptoms include amber colouration, clearance of the 
gut and rapid cessation of feeding, before eventual death. The region containing 
pathogenic determinants of the disease has been cloned, and further defined by 
mutagenesis and deletion analysis to a 16.9 kb region. Sequence analysis of the minimal 
pathogenic encoding region showed significant protein homology, but little sequence 
homology to a group of newly described toxins from a member of the Enterobacteriaceae, 
Photorhabdus luminescens. This pathogenicity-encoding region from S. entomophila 
plasmid pADAP is the subject of the invention. The proteins encoded by the genes (sepA, 
sepB, sepC) within the 16.9 kb region can be used for insect control whether as an 
inundative pesticide, within baits or expressed in other organisms such as plants or 
microbes. 

It will be appreciated that it is not intended to limit the invention to the aforementioned
examples only, many variations, which may readily occur to a person skilled in the art,
being possible without departing from the scope thereof.
Table 1 Bacterial strains, plasmids and bacteriophage used in the study

Strain, plasmid    Description                                         Reference
or phage

Bacteria

Escherichia coli
DH5α               F- φ80d lacZΔM15 Δ(lacZYA-argF)U169 recA1           Hanahan (1983)
                   endA1 supE44
DH10B              F- mcrA Δ(mrr-hsdRMS-mcrBC) φ80d lacZΔM15           Lorow and Jessee (1990)
                   ΔlacX74 endA1 recA1 deoR Δ(ara, leu)7697
                   araD139 galU galK nupG rpsL λ-
DF1                γδ transposase (tnpA)                               Gibco BRL
MC1061             sup° hsdR mcrB araD139 Δ(araABC-leu)7679            Casadaban and Cohen (1980)
                   ΔlacX74 galU galK rpsL thi
MC4100             araD139 Δ(lacZYA-argF)U169 rpsL150 StrR relA1       Silhavy et al. (1984)
                   flbB5301 deoC1 ptsF25 rbsR
XL1-Blue MRA       Δ(mcrA)183 Δ(mcrCB-hsdSMR-mrr)173 endA1             Stratagene
                   supE44 thi-1 recA1 gyrA96 relA1

Serratia entomophila
A1M02              ApR, pADAP, pathogenic                              Grimont et al. (1988)
5.6                heat cured pADAP minus derivative of A1M02          Glare et al. (1993)
5.6RC              CmR recA- pADAP minus strain                        Grkovic et al. (1996)
5.6RK              KnR recA- pADAP minus strain                        this study

Plasmids
pACYC184           CmR, TcR                                            Chang and Cohen (1978)
pADAP              Amber disease associated plasmid                    Glare et al. (1993)
pBR322             ApR, TcR                                            Bolivar et al. (1977)
pBM32              23-kb BamHI fragment from pMH32 cloned in           this study
                   pBR322
pBM32-1-40         pBM32 containing mini-Tn10 insertions               this study
pDELTA1            ApR, SmR, KnR, sucroseR                             Gibco BRL
pLAFR3             TcR, pRK290 with λcos, lacZα and multi-             Staskawicz et al. (1987)
                   cloning site from pUC8
pRK2013            IncP, KnR, Tra RK2 repRK2 repE1                     Ditta et al. (1980)
pGLA-20            10.6-kb HindIII pADAP fragment cloned in            Corben (unpublished)
                   pLAFR3
pACp4              19-kb BamHI fragment from pBM32-4 cloned in         this study
                   pACYC184
pACp8              17-kb BamHI fragment from pBM32-8 cloned in         this study
                   pACYC184
pACp10             19.5-kb BamHI fragment from pBM32-10 cloned         this study
                   in pACYC184
pACp20             20-kb BamHI fragment from pBM32-20 cloned           this study
                   in pACYC184
pACp23             21-kb BamHI fragment from pBM32-23 cloned           this study
                   in pACYC184
pACp24             21.2-kb BamHI fragment from pBM32-24 cloned         this study
                   in pACYC184
pADK-10            pADAP::mini-Tn10 insertion in 10.6-kb HindIII       Grkovic et al. (1995)
                   fragment, KnR, non-pathogenic
pADK-13            pADAP::mini-Tn10 insertion in 10.6-kb HindIII       Grkovic et al. (1995)
                   fragment, KnR, non-pathogenic
pADK-35            pADAP::mini-Tn10 insertion in 10.6-kb HindIII       Grkovic et al. (1995)
                   fragment, KnR, pathogenic
pMH32              23-kb BamHI fragment of pADAP cloned into           this study
                   pLAFR3
pMH41              33-kb BamHI fragment of pADAP cloned into           this study
                   pLAFR3
pBM32              23-kb BamHI fragment of pMH32 cloned into           this study
                   pBR322
pUC19              ApR, lacZα, multi-cloning site                      Yanisch-Perron et al. (1985)

Bacteriophage
λNK1316            mini-Tn10 derivative 103 donor, λ b522 cI857        Kleckner et al. (1991)
                   Pam80 nin5


Table 2 Position of genes and features of the predicted gene products encoded by sep genes

                                             Longest potential coding region
ORF     Putative ribosome-binding site^a     Start at         Stop at nt               sep %GC (P. luminescens
                                             nucleotide       (ORF size, bp)           homologue, %GC)

sepA    ATGGGACCATCAACGTAATGAATGAGG          241[illegible]   9547 (7131)              54 (tcbA, 43; tcdA, 44)
sepB    CGAGGAGACTGAGCATGCAA                 9598             13885 (4287)             58 (tcaC, 51)
sepC    ACAGGAGATCACATGAGC                   14545            17467 (2922)             55 (tccC, 54)
ORF1    CATAGAGACTGTCGCTATGTTA               1287             1587 (300)               39
ORF2    TTGGAGAATAACCGCCATGTT                1590             1863 (273)               39
ORF3    GGGGGAGAAAAATGAAG                    1860             2294 (435)               51
ORF4    TGACTGGGAAGGAGGGGGGGACGGTGATGAG      13908            1448?                    60
ORF5    TAACGAGACTTTTTAGCAAAATGGCACTTT       1761-1755, 1755-1773                      ?
ORF6    GAGCATGGC-Mini-Tn10-8 *              18934-18064                               ?

a Putative ribosome-binding sites are underlined, and potential start codons are in boldface; nt, nucleotides; ?
degenerate or incomplete ORF. * ORF transcribed in opposing direction.



Table 3. Comparisons of GC content between the sep and P. luminescens genes

sep (%GC)        P. luminescens toxin (%GC)

sepA (54%)       tcbA (43%), tcdA (44%)
sepB (58%)       tcaC (51%)
sepC (55%)       tccC (54%)



Table 4. Similarities of products of putative ORFs to protein sequences in the database
detected using BlastP

ORF          Protein           Degree of similarity           Function of the homologous          Organism                   Blast score      Reference"
(a.a size)   homologue         %identity/%similarity          protein
             (a.a size)        (over) a.a residue -
                               a.a residue

SepA         TcbA (2504)       34/50 (1675) 41-1628*          insecticidal toxin complex          Photorhabdus               0.0              AF047457
(2373)                         57/72 (751) 1630-2374*         protein                             luminescens
             TcdA (2405)       40/55 (2458)*                  insecticidal toxin complex          P. luminescens             0.0              Ensign et al. (1997)
                                                              protein
             TcaB (1189)       38/54 (764) 1625-2374*         insecticidal toxin complex          P. luminescens             e-137            AF046867
                               29/50 (281) 936-1198*          protein
             TccB (1565)       36/51 (859) 1575-2373*         insecticidal toxin complex          P. luminescens             e-136            AF047028
                               31/51 (289) 930-1204*          protein
             TcaA (1095)       36/56 (90) 94-183*             insecticidal toxin complex          P. luminescens             1e-8             AF046867
                               18/39 (530) 435-928*           protein
             TccA (965)        27/45 (186) 115-280*           insecticidal toxin complex          P. luminescens             5e-[illegible]   AF047028
                                                              protein
             Cbm71 (613)       24/41 (199) 1057-1250*         Mosquitocidal toxin Cbm71           Clostridium                                 g2 127309
                                                                                                  bifermentans

SepB         TcaC (1485)       49/63 (1276) 1-1263*           insecticidal toxin complex          P. luminescens             0.0              AF046867
(1428)                         64/78 (152) 1270-1421*         protein
             SpvB (591)        40/52 (357) 9-365*             Salmonella virulence protein        Salmonella                 4e-[illegible]   S22664
                                                                                                  typhimurium

SepC         TccC (1043)       [illegible]                    insecticidal toxin complex          P. luminescens             0.0              AF047028
(938)                                                         protein
             SC2H4.02 (2183)   23/34 (639) 68-677*            Hypothetical wall associated        Streptomyces               2e-12            AL031514.1
                                                              protein                             coelicolor
             WapA (2334)       22/34 (430) 255-677*           Wall associated protein             B. subtilis                2e-5             S32920
                               20/36 (613) 48-625*            precursor
             Y15898 (334)      21/34 (542) 181-684*           hypothetical wall associated        Coxiella burnetii          9e-5             Y15898
                                                              protein
             Rhs core (1420)   21/3[illegible] (463) 237-677* Rhs core protein                    E. coli                    3e-[illegible]   AF044501
                               21/[illegible] 35-300*

ORF3         BB103G (263)      45/62 (142) 1-139*             morphogenesis protein of            Bacillus subtilis          3e-27            CAA67646
(144)                                                         bacteriophage B103
             LZBP22 (146)      46/61 (139) 1-143              Phage P22, lysozyme (E              Salmonella                 1e-24            gi 138699
                                                              3.2.1.17)

ORF4         Gp55 (181)        28/42 (188) 1-184*             bacteriophage N15 protein           E. coli                    [illegible]      AF064539
(191)

ORF5         SprA              75/79 (68) 1-68 ♦              Resolvase/invertase homologue       S. typhimurium             7e-19            AF029069
(236)                                                                                                                                         AF020806

ORF6         IS91              39/56 (94) 130-197 ♦ #1        IS91 transposase                    E. coli                    4e-28            S23782
(310)                          39/58 (94) 224-318 ♦ #2
                               30/48 (76) 319-395 ♦ #1

Percent identities and similarities were calculated in relation to the deduced gene products of the sequenced
ORF. * indicates position of amino-acid similarity in relation to sequence generated in this study. ♦ indicates
position of amino-acid similarity in relation to data base protein sequence. # reading frame. " similarities were
considered potentially significant if the BlastP score exceeded e-5.
Table 5 Positions of mini-Tn10 insertions



iviini- 1 niv 


UKP 


rosiuon QownMrcain 01 


insertion 




initio rinn rrtHnn fHiri^ 






1 90 


Z*t 


KJS\Jr J 


14S 




scpA 


747 


Z / 


scpA 


1 fH7 




sspA 


1 HQ7 


0 


sepA 


1 797 
1 /Z / 


1Q 
JO 


sepA 


9897 
Zoo / 


Z 


S€pA 


7. 1 Q7 




sepA 


1777 




sepA 




1 Q 


S€pA 






sepA 


44/^7 


37 


sepA 


4467 


31 


sepA 


4627 


12 


sepB 


182 


22 


sepB 


172 


11 


sepB 


362 


10 


sepB 


2162 


35 


ORF4 


557 


13 


sepC 


2525 


8 




18937 


ORF4/-35 junction GGG CGC 7T//4 
SEQUENCE LISTING

(1) GENERAL INFORMATION 

(i) APPLICANT: Glare, Travis T 

Hurst, Mark R H 
Jackson, Trevor A 

(ii) TITLE OF INVENTION: Insecticidal nucleotide sequences

(iii) NUMBER OF SEQUENCES: 6 

(iv) CORRESPONDENCE ADDRESS: 

(A) ADDRESSEE: A J Park & Son 

(B) STREET: Huddart Parker Building, Post Office Square 

(C) CITY: Wellington 

(D) COUNTRY: New Zealand 

(vi) CURRENT APPLICATION DATA: 

(A) APPLICATION NUMBER: 

(B) FILING DATE: 

(C) CLASSIFICATION: 

(vii) PRIOR APPLICATION DATA: 

(2) INFORMATION FOR SEQ ID NO: 1: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18937 nucleotides          (A) LENGTH: 5118 amino acids

(B) TYPE: nucleotide                   (B) TYPE: amino acid

(C) STRANDEDNESS: single               (C) STRANDEDNESS:

(D) TOPOLOGY: Linear                   (D) TOPOLOGY: Linear

(ii) MOLECULE TYPE: DNA                (ii) MOLECULE TYPE: PROTEIN



(xi) SEQUENCE DESCRIPTION: SEQ ID NO : 1: 
ggatccgagt gaaggaatca tcggccgctt tatacgtttc agggtgaata cggttggccg 60
caacgtggca atggatgttg tttgtgtcgg tatgaatcgc cgcaacgtac tggtgttctg 120
acatacccag tgccgataaa ctgtgacgaa cactatcaaa gatgtgttcc gtcgacctga 180
aagccaggat ttatttttac accaatggtt gggtgggctt cctttctgaa ctggtgcatc 240
atttagccgg catcatcaaa agatgcatgg aaatacaaat atcatattta cagacaccca 300
agttgatgac ctgctccgtg agttgaaatg ccgacggggg aaatcagcag ccttttcaac 360
tcatggagca gggggaaatc aatcctcaat aacccgcatt ggatatcctg ccagtgtgca 420
tttaaccttt ttagtgtgtt tccttaatat cccaatcgtt gaatcgctac atacggcaga 480
cattagtatc tcacttatca tcaaagtaat atcacaccga gaatgctaat ttcatgatat 540
gaaaacgttc cattaataaa ttttcagaaa cctaacacgg catttttatg ctgatcagtg 600
aattgattgt ttctgaaaaa attaattgca cctctgccac ttatcagata aaaacacccc 660

ctCCCCUcC LLLLltclti ute.ttc.ctC cttttct tec tC£t I t 1 c-t t cctCct t t Lc 7 2 C 

ttaatgattt tattaatgat tttactatag atgaatgtta acatgggtga taatttactt 780 

tactcaattt aattgttggt atgaccatgt tttagatgag tggcacggat tcattattgt 840 

aaaaaaagta tctaaaacct ttaac agca a t.c cf .a rf fg a gg ai-ga ^cri-.n gaeaggaefefe Qnn — 

gattattgcc attttttacg aaggaagatg acgggtgata aataataaaa aaaacaaaag 960 

tatagcctta ggtatcgccg attacatcca gtaacactta ttgacttttt tttacttcta 1020 
ccgttagcta taaatatgat atttaaatct gtatttttat ataaaaccag tttatgatgc 1080 

tggattggtc attaaagtcg ttatatgtga tcgttatctg tcattgattg gtgtttaatc 1140 

ttttattctt ccagtgaggt ttcaggggga atgtattggg taatcatact catgtcattt 1200 

gttgctttga tgttaaatta acgtgttcat tcattatgtt ctactgttgt ttctattgtc 1260 

cggaacgacc atagagactg tcgctatgtt aataggaata tttgactggt tatatgcgcc 1320 

aagggttatc gctgcactct ctggggcgat ggtattcatc attacgcaag ataacttcat 1380 

tggtgtcaga cgggtgttat tgttttttgt gtctttttta ctcggtttga cattttcaga 1440 

gacaacagct tccgttatca acttctatat cccgaatgat atacatatag gaaatgacct 1500 

tggtgccttt gttaccagcg ccgtgacggt gaagcttttt gttatcatta tgagcaagat 1560 

agagagaaaa tatcttggag aataaccgcc atgttccaaa tcatacttct taatgttaat 1620 

gccgtgattt gcttggctat tgccgtcaga ttattcctgt ggcgtatcaa tcataaaatg 1680 

aaaaacattg tcgtctcttt tattgctttt ctcattatta cggcgtgcgg cgctgtctcc 1740 

atcaggacga tgacggggga gtattactat gcggattggt ccgagacgat cattaacctt 1800 

tcgcttttcc tgtctgttta tatacgcaat ggcgaaatcc ttcggtgggg ggagaaaaa 1859 

atg aag ata agt tcc cga ggt atc gca tta atc aaa gag ttc gaa ggt 1907 
Met Lys Ile Ser Ser Arg Gly Ile Ala Leu Ile Lys Glu Phe Glu Gly 
1 5 10 15 

ctg cgc tta cac gct tat cgc tgc gcc gct gac gtc tgg act gtc ggt 1955 
Leu Arg Leu His Ala Tyr Arg Cys Ala Ala Asp Val Trp Thr Val Gly 
20 25 30 

tat ggc cac acg gca ggg gtt aca aag ggt gac atc atc acg gtc gat 2003 
Tyr Gly His Thr Ala Gly Val Thr Lys Gly Asp Ile Ile Thr Val Asp 
35 40 45 

gaa gcc cag acg atg ctg aca aac gat att acc gta ttt gaa cgg gcg 2051 
Glu Ala Gln Thr Met Leu Thr Asn Asp Ile Thr Val Phe Glu Arg Ala 
50 55 60 

gtc agt cag gcc gtc gcg gtt cct ctg aat cag tcg caa tac gat gcc 2099 
Val Ser Gln Ala Val Ala Val Pro Leu Asn Gln Ser Gln Tyr Asp Ala 
65 70 75 80 

ctg gtt tct ttg gtt ttt aat att ggc cag ggg aat ttt aaa cgc tct 2147 
Leu Val Ser Leu Val Phe Asn Ile Gly Gln Gly Asn Phe Lys Arg Ser 
85 90 95 

acc ttg ttg aaa aaa ctc aac aaa cag gac tat gtc ggc gcc ggg aac 2195 
Thr Leu Leu Lys Lys Leu Asn Lys Gln Asp Tyr Val Gly Ala Gly Asn 
100 105 110 

gag ttt tta cgc tgg acc cgg gcc aat ggg aag gtc ctt ccc gga ctg 2243 
Glu Phe Leu Arg Trp Thr Arg Ala Asn Gly Lys Val Leu Pro Gly Leu 
115 120 125 

att cgc cga cgc gaa gct gaa cgg gtg ttg ttt gag aaa ctg ggt gca 2291 
Ile Arg Arg Arg Glu Ala Glu Arg Val Leu Phe Glu Lys Leu Gly Ala 
130 135 140 

taa ccctttgcga cgtacccaca agatgaagat aacaccgcgt actgagcggt 2344 

145 

ggcgcaacaa tgaataaatg actgtgtacg gcctgtcctt cacaacggat gggaccatca 2404 

acgtaa tga atg agg caa gac att atg tat aat att gat gat att ctg 2452 
Met Arg Gln Asp Ile Met Tyr Asn Ile Asp Asp Ile Leu 
150 155 

gag aaa gtg aat gct cca cga gca cgc ctg tca gaa gaa aac gat aca 2500 
Glu Lys Val Asn Ala Pro Arg Ala Arg Leu Ser Glu Glu Asn Asp Thr 
160 165 170 175 

gcg gtg acg ctg acg gat tta ttc tcg cgt tcg ttt ccc gag gtc aaa 2548 
Ala Val Thr Leu Thr Asp Leu Phe Ser Arg Ser Phe Pro Glu Val Lys 
180 185 190 

aaa atc act ggc gac agc ctg tca tgg gga gag gtc tgc tat ctg tac 2596 
Lys Ile Thr Gly Asp Ser Leu Ser Trp Gly Glu Val Cys Tyr Leu Tyr 
195 200 205 

agt cag gcg cag cac gaa cag aaa gaa aac cgg ctc acc gaa tcc cgt 2644 
Ser Gln Ala Gln His Glu Gln Lys Glu Asn Arg Leu Thr Glu Ser Arg 
210 215 220 

att ctg gcc cgg gcg aat ccc cta ctg gtg aat gcc gtt cgc ctg gga 2692 
Ile Leu Ala Arg Ala Asn Pro Leu Leu Val Asn Ala Val Arg Leu Gly 
225 230 235 

ata cgc cac gca ccc cgc act cgc age tat gat gac tgc ttt ggc tec 2740 
Ile Arg Gln Ala Ala Gly Ser Arg Ser Tyr Asp Asp Trp Phe Gly Ser 
240 245 250 255 

cgc gca gac cgt ttc gcc cgc ccc ggc tcg gtg gcc tcc atg ttc tca 2788 
Arg Ala Asp Arg Phe Ala Arg Pro Gly Ser Val Ala Ser Met Phe Ser 
260 265 270 

ccg gcg gcg tat ctg acc gag ctg tac cgt gag gcg aag gac ctg cat 2836 
Pro Ala Ala Tyr Leu Thr Glu Leu Tyr Arg Glu Ala Lys Asp Leu His 
275 280 285 

ccg gac acc tcg ctg ttc cgg ctg gac atc cgg cgt ccc gac ctg gcg 2884 
Pro Asp Thr Ser Leu Phe Arg Leu Asp Ile Arg Arg Pro Asp Leu Ala 
290 295 300 

gcg ctg gcc ctt agc cag aat aat atg gac gac gag ctc tcc acc ctg 2932 
Ala Leu Ala Leu Ser Gln Asn Asn Met Asp Asp Glu Leu Ser Thr Leu 
305 310 315 

agc ctg tcc aat gag cta ctg tat cgc ggt atc ggg gca gcg gaa ggg 2980 
Ser Leu Ser Asn Glu Leu Leu Tyr Arg Gly Ile Gly Ala Ala Glu Gly 
320 325 330 335 

ctt gac gac gac agc gtc agg gag ctg ctc gcc ggg tat cgc ctg acc 3028 
Leu Asp Asp Asp Ser Val Arg Glu Leu Leu Ala Gly Tyr Arg Leu Thr 
340 345 350 

ggc ctg acc ccc tat cac tgg gcg tac gag gcg gcc cgc caa gcc att 3076 
Gly Leu Thr Pro Tyr His Trp Ala Tyr Glu Ala Ala Arg Gln Ala Ile 
355 360 365 

ctg gtg cag gac ccg acg ctg atg ggg ttc agc cgt aat ccg gat gtg 3124 
Leu Val Gln Asp Pro Thr Leu Met Gly Phe Ser Arg Asn Pro Asp Val 
370 375 380 

gcg cag ctt atg gac cct gcc tcc atg ctg gcc att gaa gcc gat att 3172 
Ala Gln Leu Met Asp Pro Ala Ser Met Leu Ala Ile Glu Ala Asp Ile 
385 390 395 

tca ccg gag ctg tat cag ata ctg gcc gaa gaa att acg aca gac agt 3220 
Ser Pro Glu Leu Tyr Gln Ile Leu Ala Glu Glu Ile Thr Thr Asp Ser 
400 405 410 415 

tac gaa gca ctc tgg agt aag aat ttt ggt gat atg cct ccc tcc tca 3268 
Tyr Glu Ala Leu Trp Ser Lys Asn Phe Gly Asp Met Pro Pro Ser Ser 
420 425 430 

ctg tta tct tat gat gca ctt gca aca ttt tat gat ctt gat tac gat 3316 
Leu Leu Ser Tyr Asp Ala Leu Ala Thr Phe Tyr Asp Leu Asp Tyr Asp 
435 440 445 

gag cta act tcg tta ttg tca tta agg ctg gac ttt tca aat cca aac 3364 
Glu Leu Thr Ser Leu Leu Ser Leu Arg Leu Asp Phe Ser Asn Pro Asn 
450 455 460 

aat gaa tac tac att aat agt caa tta agt gtc gta act ctg aat gaa 3412 
Asn Glu Tyr Tyr Ile Asn Ser Gln Leu Ser Val Val Thr Leu Asn Glu 
465 470 475 

age act get tta ata act ata cat cat tat tta aca acc eta ccc cc a 3460 
Ser Thr Gly Leu Ile Thr Ile His His Tyr Leu Arg Thr Leu Gly Gly 
480 485 490 495 

gac tca cag cag att aac cct gag ctt ata cct tat ggg gat gga aca 3508 
Asp Ser Gln Gln Ile Asn Pro Glu Leu Ile Pro Tyr Gly Asp Gly Thr 
500 505 510 

tat ctt tat aat ttc agc gtg gtg tca acg ata tca gag gat agt ttc 3556 
Tyr Leu Tyr Asn Phe Ser Val Val Ser Thr Ile Ser Glu Asp Ser Phe 
515 520 525 

aaa cta ggg tcg tta ggt tct aac agt agc aat ctt tac tct ggg gat 3604 
Lys Leu Gly Ser Leu Gly Ser Asn Ser Ser Asn Leu Tyr Ser Gly Asp 
530 535 540 

tat cag ctt caa aaa ggg gtt cgc tat agc att cct gtt gaa ata gat 3652 
Tyr Gln Leu Gln Lys Gly Val Arg Tyr Ser Ile Pro Val Glu Ile Asp 
545 550 555 

gaa gga aag tta aat gat ggg atc aca ata gga ttg agt agg aaa ggg 3700 
Glu Gly Lys Leu Asn Asp Gly Ile Thr Ile Gly Leu Ser Arg Lys Gly 
560 565 570 575 

ggg gga tat tac tca aca gta aac ttc act ctg att gaa tat gat cct 3748 
Gly Gly Tyr Tyr Ser Thr Val Asn Phe Thr Leu Ile Glu Tyr Asp Pro 
580 585 590 

gcg ata ttc att ctt aaa tta aat aaa gtt atc cgc cta tac aag gcc 3796 
Ala Ile Phe Ile Leu Lys Leu Asn Lys Val Ile Arg Leu Tyr Lys Ala 
595 600 605 

acg ggc atg acc acg gcg gaa ata tat caa atc acc aat att ctt aat 3844 
Thr Gly Met Thr Thr Ala Glu Ile Tyr Gln Ile Thr Asn Ile Leu Asn 
610 615 620 

aac ggt ctc acc att gac cat gcg gtc ctg agt aaa atc ttc ctg gtc 3892 
Asn Gly Leu Thr Ile Asp His Ala Val Leu Ser Lys Ile Phe Leu Val 
625 630 635 

cgt tac ctg atg cgt cac tat cag ctt gat gtg gcc cgg tca ctg ata 3940 
Arg Tyr Leu Met Arg His Tyr Gln Leu Asp Val Ala Arg Ser Leu Ile 
640 645 650 655 

ttg tgc aac gga acc atc agt gac cag gcg ttc agc ggc gaa acc ggc 3988 
Leu Cys Asn Gly Thr Ile Ser Asp Gln Ala Phe Ser Gly Glu Thr Gly 
660 665 670 

ctg ttc acc acg ctg ttc aac acc cca ccg ctg aac ggc cag ctg ttt 4036 
Leu Phe Thr Thr Leu Phe Asn Thr Pro Pro Leu Asn Gly Gln Leu Phe 
675 680 685 

tct gca gat gat acc ccc ctc gac tta cgc tct gaa gca ccg gag gat 4084 
Ser Ala Asp Asp Thr Pro Leu Asp Leu Arg Ser Glu Ala Pro Glu Asp 
690 695 700 

act ttc cct etc acc eta etc aaa ccc cca ttt aac ate acc ccc tec 4132 
Ala Phe Arg Leu Ser Val Leu Lys Arg Ala Phe Asn Ile Ser Ala Ser 
705 710 715 

ggg ctt tcc acg ctc tgg cag ttg gcc agc ggt gac agc agc gct ggg 4180 
Gly Leu Ser Thr Leu Trp Gln Leu Ala Ser Gly Asp Ser Ser Ala Gly 
720 725 730 735 

ttt agc tgc tct gct gac aat atc gcc gca ctc tac cga gtg aaa ctc 4228 
Phe Ser Cys Ser Ala Asp Asn Ile Ala Ala Leu Tyr Arg Val Lys Leu 
740 745 750 

ctg gct gac atc cac gac cta tcc gct ggt gag ctg tca atg ttg ctg 4276 
Leu Ala Asp Ile His Asp Leu Ser Ala Gly Glu Leu Ser Met Leu Leu 
755 760 765 

tcc gtc tcc cct ttc agc ggg gtg gcc gcc ggc tcg ctg tcc gat aat 4324 
Ser Val Ser Pro Phe Ser Gly Val Ala Ala Gly Ser Leu Ser Asp Asn 
770 775 780 

gag ctg acg cag ttt ctg tac cag acc acc acc tgg ctc acg gag cag 4372 
Glu Leu Thr Gln Phe Leu Tyr Gln Thr Thr Thr Trp Leu Thr Glu Gln 
785 790 795 

ggc tgg acg gtc agc gat gtg ttc ctg atg ctg acg acg cag tac ggt 4420 
Gly Trp Thr Val Ser Asp Val Phe Leu Met Leu Thr Thr Gln Tyr Gly 
800 805 810 815 

acc ctg ctg acc ccc gac att gag aac ctg ctc gct tcc ctg cgc aac 4468 
Thr Leu Leu Thr Pro Asp Ile Glu Asn Leu Leu Ala Ser Leu Arg Asn 
820 825 830 

gga ctg tcg ggc cgt gag ctg ttc ccg gaa acg ctc ccc ggc gat ggc 4516 
Gly Leu Ser Gly Arg Glu Leu Phe Pro Glu Thr Leu Pro Gly Asp Gly 
835 840 845 

gct ccc ttt att gcc gcc gcc atg cag ctg gac gcc acg gat acg gcg 4564 
Ala Pro Phe Ile Ala Ala Ala Met Gln Leu Asp Ala Thr Asp Thr Ala 
850 855 860 

aag gcg atg ctg act tgg gcg gac cag ttg aag cca gag ggg ctg acg 4612 
Lys Ala Met Leu Thr Trp Ala Asp Gln Leu Lys Pro Glu Gly Leu Thr 
865 870 875 

ctg acg gaa ttt att ctt ttg gtg atg aat gcc gcc cca aat gac gag 4660 
Leu Thr Glu Phe Ile Leu Leu Val Met Asn Ala Ala Pro Asn Asp Glu 
880 885 890 895 

cag gcg ggc cag atg gca ggg ttc tgc caa gcc ctg tgg caa ctg gca 4708 
Gln Ala Gly Gln Met Ala Gly Phe Cys Gln Ala Leu Trp Gln Leu Ala 
900 905 910 

ctg atc atc cgc agc acc ggc ctc agc acg cgc gag ctg acg ctg ctg 4756 
Leu Ile Ile Arg Ser Thr Gly Leu Ser Thr Arg Glu Leu Thr Leu Leu 
915 920 925 

gtc agc cag ccg gga cgc ttc cgc aca gga tgg cac cat ctg ccc cat 4804 
Val Ser Gln Pro Gly Arg Phe Arg Thr Gly Trp His His Leu Pro His 
930 935 940 

^ ? £C cc 9 ?cc ctt ccc gac an acc cci tti cai ccr c:c ctt aac 4852 
Asp Leu Pro Ala Leu Arg Asp Ile Thr Arg Phe His Ala Val Val Asn 
945 950 955 

cgc agc ggc agc cat gcc ggg gag gtc ctg acc gca ctt gag acc gga 4900 
Arg Ser Gly Ser His Ala Gly Glu Val Leu Thr Ala Leu Glu Thr Gly 
960 965 970 975 

gaa ctg tcg tca gcc ctg ctg gcc cgg gcc ctg tca cag aat gag cag 4948 
Glu Leu Ser Ser Ala Leu Leu Ala Arg Ala Leu Ser Gln Asn Glu Gln 
980 985 990 

gat gtg acc ggc gcc ttg gcg cag gtg agg ggg gcc ggt gaa cag gac 4996 
Asp Val Thr Gly Ala Leu Ala Gln Val Arg Gly Ala Gly Glu Gln Asp 
995 1000 1005 



aac agc gtg ttc acc tcc tgg gaa gag gtg gac cag gct gag cag tgg 5044 
Asn Ser Val Phe Thr Ser Trp Glu Glu Val Asp Gln Ala Glu Gln Trp 
1010 1015 1020 

ctg gac atg agt gag acc ctg tcc att acg cca tcc ggt ctg gct agc 5092 
Leu Asp Met Ser Glu Thr Leu Ser Ile Thr Pro Ser Gly Leu Ala Ser 
1025 1030 1035 

ctg att gcc ctg aag tac atc aat gtg tcc gat gac agt gca ccg ttg 5140 
Leu Ile Ala Leu Lys Tyr Ile Asn Val Ser Asp Asp Ser Ala Pro Leu 
1040 1045 1050 1055 

tac agc cag tgg cag gtg gta tcc ggt ctg ctg cag gcc ggg ctg aaa 5188 
Tyr Ser Gln Trp Gln Val Val Ser Gly Leu Leu Gln Ala Gly Leu Lys 
1060 1065 1070 

agc agc cag agc tcg gcg ctg cac gat tat ctg gag gag ggg acc agc 5236 
Ser Ser Gln Ser Ser Ala Leu His Asp Tyr Leu Glu Glu Gly Thr Ser 
1075 1080 1085 

agc gcc ctt tgt gcg tat tat ctg cgt aat ctg gca ccg aac atg gta 5284 
Ser Ala Leu Cys Ala Tyr Tyr Leu Arg Asn Leu Ala Pro Asn Met Val 
1090 1095 1100 

tcc ggg cgc gat gac ctc ttc ggg tat ctg ctg ctg gat aat cag gtg 5332 
Ser Gly Arg Asp Asp Leu Phe Gly Tyr Leu Leu Leu Asp Asn Gln Val 
1105 1110 1115 

tca gcc aag gta aaa acc acc cgc att gcg gag gcc atc gcc ggc ata 5380 
Ser Ala Lys Val Lys Thr Thr Arg Ile Ala Glu Ala Ile Ala Gly Ile 
1120 1125 1130 1135 

cgg ctg tat atc aac cgg gcc ctt aac gga ata gaa ctc agc gcc atg 5428 
Arg Leu Tyr Ile Asn Arg Ala Leu Asn Gly Ile Glu Leu Ser Ala Met 
1140 1145 1150 

gca gag gtg agg ggg cgt cag ttt ttc act gac tgg gat acg ttc aac 5476 
Ala Glu Val Arg Gly Arg Gln Phe Phe Thr Asp Trp Asp Thr Phe Asn 
1155 1160 1165 

aaa cci Lac age acc tec gee ccc etc tCc qac etc ctt tar tat ccc 5524 
Lys Arg Tyr Ser Thr Trp Ala Gly Val Ser Glu Leu Val Tyr Tyr Pro 
1170 1175 1180 



gaa aac tac ctc gac ccg acg gtc cgt atc ggg cag acc ggc atg atg 5572 
Glu Asn Tyr Leu Asp Pro Thr Val Arg Ile Gly Gln Thr Gly Met Met 
1185 1190 1195 



gac acc ctg ctg cag tct gtc agc cag agc agt atc aac cgc gat acc 5620 
Asp Thr Leu Leu Gln Ser Val Ser Gln Ser Ser Ile Asn Arg Asp Thr 
1200 1205 1210 1215 



gtg gag gat gcc ttt aaa acc tat ctg acc acg ttt gag cag att gcc 5668 
Val Glu Asp Ala Phe Lys Thr Tyr Leu Thr Thr Phe Glu Gln Ile Ala 
1220 1225 1230 

aat ctg aac act gtc agc gga tat cac gat aac gcc agc atg acg cag 5716 
Asn Leu Asn Thr Val Ser Gly Tyr His Asp Asn Ala Ser Met Thr Gln 
1235 1240 1245 

ggg act aca tgg tat gtg ggt cgc agc atc aca gat cag act aac tgg 5764 
Gly Thr Thr Trp Tyr Val Gly Arg Ser Ile Thr Asp Gln Thr Asn Trp 
1250 1255 1260 

tac tgg cgc agc gcc aac cac agc aaa atc caa gac tca atg atg ccc 5812 
Tyr Trp Arg Ser Ala Asn His Ser Lys Ile Gln Asp Ser Met Met Pro 
1265 1270 1275 

gcg aat gcc tgg acc gga tgg aca aaa att aac tgc gga atg aat ccg 5860 
Ala Asn Ala Trp Thr Gly Trp Thr Lys Ile Asn Cys Gly Met Asn Pro 
1280 1285 1290 1295 

tgg tca gat ctt gtg tgc tcg gtg ttt ttc aac agt cgc ctt tat gtc 5908 
Trp Ser Asp Leu Val Cys Ser Val Phe Phe Asn Ser Arg Leu Tyr Val 
1300 1305 1310 

gtc tgg gtc gaa gag aat cag tct gct gat acg gag gca gag agc acg 5956 
Val Trp Val Glu Glu Asn Gln Ser Ala Asp Thr Glu Ala Glu Ser Thr 
1315 1320 1325 

aca acc acg cag cag agc tac acg ctg aaa ctg tcg ttc cgg cgc tac 6004 
Thr Thr Thr Gln Gln Ser Tyr Thr Leu Lys Leu Ser Phe Arg Arg Tyr 
1330 1335 1340 

gac ggt aca tgg agt tcc ccg gtg tcg ttc gac att acc ggc aac atc 6052 
Asp Gly Thr Trp Ser Ser Pro Val Ser Phe Asp Ile Thr Gly Asn Ile 
1345 1350 1355 

gca ttt ccg gaa acg cag ggc atg cat gtg acc tgt aat ccc ctg act 6100 
Ala Phe Pro Glu Thr Gln Gly Met His Val Thr Cys Asn Pro Leu Thr 
1360 1365 1370 1375 

gag cag ctc tat tgc gcg ttt tac tcc gtc acc agc aag ccg gac ttt 6148 
Glu Gln Leu Tyr Cys Ala Phe Tyr Ser Val Thr Ser Lys Pro Asp Phe 
1380 1385 1390 

gat aac gct cag ctg att tct gtg gat aat gat atg acg cta aat gtc 6196 
Asp Asn Ala Gln Leu Ile Ser Val Asp Asn Asp Met Thr Leu Asn Val 
1395 1400 1405 

ate tea gat ata ccc att ttt aac acc etc act. car caa ttt aat acc 6244 
Ile Ser Asp Ile Gly Ile Phe Lys Ser Val Ser His Glu Phe Asn Thr 
1410 1415 1420 



agc act gag aaa ttt att aat aat gtt ttt tca gac cct tcc gct aat 6292 
Ser Thr Glu Lys Phe Ile Asn Asn Val Phe Ser Asp Pro Ser Ala Asn 
1425 1430 1435 

tat ttt gtc agt gca acg agt tta att gat gat gtt atc cac agc gat 6340 
Tyr Phe Val Ser Ala Thr Ser Leu Ile Asp Asp Val Ile His Ser Asp 
1440 1445 1450 1455 

ttc tca ctc ctt aat tct aaa act aca agt act gtt ttt act aat gaa 6388 
Phe Ser Leu Leu Asn Ser Lys Thr Thr Ser Thr Val Phe Thr Asn Glu 
1460 1465 1470 

gat tcc tct ctt ttg acg cca gag ctt cat att aca gca aat gtt tcg 6436 
Asp Ser Ser Leu Leu Thr Pro Glu Leu His Ile Thr Ala Asn Val Ser 
1475 1480 1485 

tgt ttt gtt agt act gct ggc atc gcc act caa tct acc ata gaa aaa 6484 
Cys Phe Val Ser Thr Ala Gly Ile Ala Thr Gln Ser Thr Ile Glu Lys 
1490 1495 1500 

ttc gtt cag gca ggg ata gaa ttt gag gaa att aat ttt tat gca ggc 6532 
Phe Val Gln Ala Gly Ile Glu Phe Glu Glu Ile Asn Phe Tyr Ala Gly 
1505 1510 1515 

cag gcc gcc ggc gga ttt gac gga ttt gtg gga gtg gat gtt tct aat 6580 
Gln Ala Ala Gly Gly Phe Asp Gly Phe Val Gly Val Asp Val Ser Asn 
1520 1525 1530 1535 

tca aaa gta tac cag gtc gga aaa gaa gca gtt ggt gtc act gta aaa 6628 
Ser Lys Val Tyr Gln Val Gly Lys Glu Ala Val Gly Val Thr Val Lys 
1540 1545 1550 

tct tat tcc gtc act ggc gtt agt ggt tct gtt gag tta ttt att gat 6676 
Ser Tyr Ser Val Thr Gly Val Ser Gly Ser Val Glu Leu Phe Ile Asp 
1555 1560 1565 

tca tca aat aaa tac ttc agc gga att ttg tca gat aaa atg ata acc 6724 
Ser Ser Asn Lys Tyr Phe Ser Gly Ile Leu Ser Asp Lys Met Ile Thr 
1570 1575 1580 

gct tta att agc ggc agt aca tca aaa gtt aat tac gtg tcg tct att 6772 
Ala Leu Ile Ser Gly Ser Thr Ser Lys Val Asn Tyr Val Ser Ser Ile 
1585 1590 1595 

ggc tct caa gat ttt tgg agt gta aag tcg ctc atg ccg gca ctt cag 6820 
Gly Ser Gln Asp Phe Trp Ser Val Lys Ser Leu Met Pro Ala Leu Gln 
1600 1605 1610 1615 



ata tat gaa tta atc gat gat atc ata ctg aca tcc ggc gta aat ggg 6868 
Ile Tyr Glu Leu Ile Asp Asp Ile Ile Leu Thr Ser Gly Val Asn Gly 
1620 1625 1630 

act gaa att aaa tec tec cct tec get gaa tgg tat aat gat aag ctg 6916 
Thr Glu Ile Lys Ser Try. Pro Ser Ala Glu Trp Tyr Asn Asp Lys Leu 
1635 1640 1645 

agt ctg caa tcc ggg aat aat ctt ttc aac acc aaa tcg ctg agt ttt 6964 
Ser Leu Gln Ser Gly Asn Asn Leu Phe Asn Thr Lys Ser Leu Ser Phe 
1650 1655 1660 

acc gtt aat acc agt gat atfe— gtt gaa gat gag ttt gac gtg acg ttt 7012 
Thr Val Asn Thr Ser Asp Ile Val Glu Asp Glu Phe Asp Val Thr Phe 
1665 1670 1675 

acg ttc acc gct gtc gat cag aat aac gtc gtg ctg gcc gcc cgg acg 7060 
Thr Phe Thr Ala Val Asp Gln Asn Asn Val Val Leu Ala Ala Arg Thr 
1680 1685 1690 1695 
gcc ata tta acc gtc att cga aac att aat aat gac act tcc gtt atc 7108 
Ala Ile Leu Thr Val Ile Arg Asn Ile Asn Asn Asp Thr Ser Val Ile 
1700 1705 1710 

gca tta cgt aaa aat acg cgt ggc gcg cag tat att cgt ttc act gcg 7156 
Ala Leu Arg Lys Asn Thr Arg Gly Ala Gln Tyr Ile Arg Phe Thr Ala 
1715 1720 1725 

ggt aac gat gtg gcg ctt att cgc ctc aac acc ctc ttt gcc cgc caa 7204 
Gly Asn Asp Val Ala Leu Ile Arg Leu Asn Thr Leu Phe Ala Arg Gln 
1730 1735 1740 

ctg gtc gac cgg gcg aat acc ggg att gac acc att ctt tcc atg gag 7252 
Leu Val Asp Arg Ala Asn Thr Gly Ile Asp Thr Ile Leu Ser Met Glu 
1745 1750 1755 

acc cag agg ctt acc gaa ccc gcc ctg gaa gag ggg agt gat gtg ttt 7300 
Thr Gln Arg Leu Thr Glu Pro Ala Leu Glu Glu Gly Ser Asp Val Phe 
1760 1765 1770 1775 

atg gac ttc tcc gga gcc aat gcc ctc tat ttc tgg gag ctg ttc tat 7348 
Met Asp Phe Ser Gly Ala Asn Ala Leu Tyr Phe Trp Glu Leu Phe Tyr 
1780 1785 1790 

tac acg ccg atg atg gtg ttc cag cgg ttg ttg cag gaa cag cac ttc 7396 
Tyr Thr Pro Met Met Val Phe Gln Arg Leu Leu Gln Glu Gln His Phe 
1795 1800 1805 

ccg gaa gcc acc cgc tgg ctg cag tat gtc tgg aac ccg gcc ggg cac 7444 
Pro Glu Ala Thr Arg Trp Leu Gln Tyr Val Trp Asn Pro Ala Gly His 
1810 1815 1820 

gtg gta aac ggg gtg ctg cag aat tac acc tgg aat gtc cgt ccg ctg 7492 
Val Val Asn Gly Val Leu Gln Asn Tyr Thr Trp Asn Val Arg Pro Leu 
1825 1830 1835 

gag gag gac acc ggc tgg aac gac tcg ccg ctg gac tcc att gac ccc 7540 
Glu Glu Asp Thr Gly Trp Asn Asp Ser Pro Leu Asp Ser Ile Asp Pro 
1840 1845 1850 1855 

gat gca ata gcc cag tac gac ccc atg cat tac aag gtc gcc acc ttt 7588 
Asp Ala Ile Ala Gln Tyr Asp Pro Met His Tyr Lys Val Ala Thr Phe 
1860 1865 1870 

atg tec tac etc cac cic etc att ccc ccc cci cat ccc ccc tac ccc 7636 
Met Ser Tyr Leu Asp Leu Leu Ile Ala Arg Gly Asp Ala Ala Tyr Arg 
1875 1880 1885 

ctg ctc gag cgg gac acc ctt aac gag gcc cgg atg tgg tac gtc cag 7684 
Leu Leu Glu Arg Asp Thr Leu Asn Glu Ala Arg Met Trp Tyr Val Gln 
1890 1895 1900 

gcc ctg aac ctt ctg ggc gac gag ccc tat att tcc ttt gac gcc gac 7732 
Ala Leu Asn Leu Leu Gly Asp Glu Pro Tyr Ile Ser Phe Asp Ala Asp 
1905 1910 1915 

tgg tcg gcg ttg acc ctg ggt gac gca gcc agc gag gtg acg cga cgc 7780 
Trp Ser Ala Leu Thr Leu Gly Asp Ala Ala Ser Glu Val Thr Arg Arg 
1920 1925 1930 1935 

gat tac cag gag gcc ctg ctg gcc gtg cgc cgg ttg gtg ccc gct ccc 7828 
Asp Tyr Gln Glu Ala Leu Leu Ala Val Arg Arg Leu Val Pro Ala Pro 
1940 1945 1950 

gag aca cgg acg gcg aat tcc ctg acg gca ctg ttc ctc ccg cag cag 7876 
Glu Thr Arg Thr Ala Asn Ser Leu Thr Ala Leu Phe Leu Pro Gln Gln 
1955 1960 1965 

aac gag gtg ctc aaa ggc tac tgg caa acc ttg gca cag cgg ctc cat 7924 
Asn Glu Val Leu Lys Gly Tyr Trp Gln Thr Leu Ala Gln Arg Leu His 
1970 1975 1980 

aac ctg cgc cac aac ctc tcc att gac ggc cag ccg ctt tcc ctg tcc 7972 
Asn Leu Arg His Asn Leu Ser Ile Asp Gly Gln Pro Leu Ser Leu Ser 
1985 1990 1995 

gtc tac gcc acg ccg tcc gaa ccg tcc gcc ctg cag agt gcc gtc gtc 8020 
Val Tyr Ala Thr Pro Ser Glu Pro Ser Ala Leu Gln Ser Ala Val Val 
2000 2005 2010 2015 

aac agc gcg cag ggt gct gca gca ctg ccg gcc gcg gtg atg ccg ctt 8068 
Asn Ser Ala Gln Gly Ala Ala Ala Leu Pro Ala Ala Val Met Pro Leu 
2020 2025 2030 

tac agt ttc ccg gtc atg ctg gag aac gcc cgg ggg atg gtg agc ctg 8116 
Tyr Ser Phe Pro Val Met Leu Glu Asn Ala Arg Gly Met Val Ser Leu 
2035 2040 2045 

ctg acc ggg ttc ggc aac aca ctg ctc ggt att acc gag cgt cag gat 8164 
Leu Thr Gly Phe Gly Asn Thr Leu Leu Gly Ile Thr Glu Arg Gln Asp 
2050 2055 2060 

gcg gag gcg ctg gcc aaa ctg ctg cag acc cag ggc agt gaa ctg ata 8212 
Ala Glu Ala Leu Ala Lys Leu Leu Gln Thr Gln Gly Ser Glu Leu Ile 
2065 2070 2075 

cgc cag ggc ctt cgc cag cag gat aac gtc ctc gag gaa atc gat gcg 8260 
Arg Gln Gly Leu Arg Gln Gln Asp Asn Val Leu Glu Glu Ile Asp Ala 
2080 2085 2090 2095 

cat atr ccc acc etc cac cac acc cac cgc ccc ccc cac ate cat ttt 8308 
Asp Ile Ala Ala Leu Glu Glu Ser Arg Arg Gly Ala Gln Met Arg Phe 
2100 2105 2110 

gaa cgt tac aaa gtg ttg tac gag gcg gac gtc aac acc ggc gaa aaa 8356 
Glu Arg Tyr Lys Val Leu Tyr Glu Ala Asp Val Asn Thr Gly Glu Lys 
2115 2120 2125 

cag gee atg g ac t tg L ac c Lc ay L Le y Lee gtg crg-tcg— gca— fca acc 8404 
Gln Ala Met Asp Leu Tyr Leu Ser Ser Ser Val Leu Ser Ala Ser Thr 
2130 2135 2140 

gcc gcg ctc ttt ttg gcc gag gcc gcg gcc gat atg ctg ccc aat att 8452 
Ala Ala Leu Phe Leu Ala Glu Ala Ala Ala Asp Met Leu Pro Asn Ile 
2145 2150 2155 

tac ggg ctg gcc gtc ggg ggc tcc cgc tat ggg gca cta ttt aaa gcc 8500 
Tyr Gly Leu Ala Val Gly Gly Ser Arg Tyr Gly Ala Leu Phe Lys Ala 
2160 2165 2170 2175 

acc gcc atc ggc atc cag gtg tcc tcc gat gcc acc cgc ata tca gcg 8548 
Thr Ala Ile Gly Ile Gln Val Ser Ser Asp Ala Thr Arg Ile Ser Ala 
2180 2185 2190 

gac aaa atc agc cag tcg gaa gtg tac cgc cgt cgc cgg gag gag tgg 8596 
Asp Lys Ile Ser Gln Ser Glu Val Tyr Arg Arg Arg Arg Glu Glu Trp 
2195 2200 2205 

gaa atc cag cgt gat agt gcg cag tct gac gtg gcg cag att gat gcc 8644 
Glu Ile Gln Arg Asp Ser Ala Gln Ser Asp Val Ala Gln Ile Asp Ala 
2210 2215 2220 

cag ctg gcg gcc atg gca gtg cgc cgg gaa ggg gct gag ctg cag aaa 8692 
Gln Leu Ala Ala Met Ala Val Arg Arg Glu Gly Ala Glu Leu Gln Lys 
2225 2230 2235 

act tac ctt gag acc cag cag acc cag gca cag gcg cag ttg gca ttc 8740 
Thr Tyr Leu Glu Thr Gln Gln Thr Gln Ala Gln Ala Gln Leu Ala Phe 
2240 2245 2250 2255 

ctg cag agt aag ttc aac aat acg gct ctg tac agc tgg ctg cgg ggc 8788 
Leu Gln Ser Lys Phe Asn Asn Thr Ala Leu Tyr Ser Trp Leu Arg Gly 
2260 2265 2270 

agg ttg tcc gcc att tat tac cag ttc tat gac ctg gca gta tcc cgc 8836 
Arg Leu Ser Ala Ile Tyr Tyr Gln Phe Tyr Asp Leu Ala Val Ser Arg 
2275 2280 2285 

tgc ctg atg gcg caa cag gcc tgg cag tgg gat aaa ttc gag act agg 8884 
Cys Leu Met Ala Gln Gln Ala Trp Gln Trp Asp Lys Phe Glu Thr Arg 
2290 2295 2300 

tcg ttt atc cag ccg ggg gcc tgg atg ggg gca aat gcc ggt ctg ctg 8932 
Ser Phe Ile Gln Pro Gly Ala Trp Met Gly Ala Asn Ala Gly Leu Leu 
2305 2310 2315 

gcc ggg gaa acc ctg atg ctg aat ctg gcg cag atg gag cag gcc tgg 8980 
Ala Gly Glu Thr Leu Met Leu Asn Leu Ala Gln Met Glu Gln Ala Trp 
2320 2325 2330 2335 

etc &cc ggc cc.; gac ccc ccc azc. cac etc ccc cgc ecc etc ice etc 9028 
Leu Thr Gly Asp Glu Arg Ala Ile Glu Val Thr Arg Thr Val Cys Leu 
2340 2345 2350 

tcg gag gtc tat acc agc ctc gcg gag gat gcg gca ttc tct ctg gcc 9076 
Ser Glu Val Tyr Thr Ser Leu Ala Glu Asp Ala Ala Phe Ser Leu Ala 
2355 2360 2365 

gac aag gtg gtg gaa ctg gtc agt aac ggt tcg ggc agt gcg ggt acg 9124 
Asp Lys Val Val Glu Leu Val Ser Asn Gly Ser Gly Ser Ala Gly Thr 
2370 2375 2380 

aaa agc aac gga tta cag atg gat caa cag caa ctc gag gcc acc ctg 9172 
Lys Ser Asn Gly Leu Gln Met Asp Gln Gln Gln Leu Glu Ala Thr Leu 
2385 2390 2395 



aaa ctg gct gac ctc ggt atc ggc aac gat tac ccg gtc tcc ctt ggc 9220 
Lys Leu Ala Asp Leu Gly Ile Gly Asn Asp Tyr Pro Val Ser Leu Gly 
2400 2405 2410 2415 

acc atg agg cgc atc aaa caa ata agc gtc acg ctc ccg gcg ctg gtc 9268 
Thr Met Arg Arg Ile Lys Gln Ile Ser Val Thr Leu Pro Ala Leu Val 
2420 2425 2430 

ggc ccc tat cag gac gtc cgt gcg gtt ctc agc tac ggc gga agt atg 9316 
Gly Pro Tyr Gln Asp Val Arg Ala Val Leu Ser Tyr Gly Gly Ser Met 
2435 2440 2445 

gtc atg ccc cgg ggt tgc agc gcg ctg gcg gtc tca cac gga atg aac 9364 
Val Met Pro Arg Gly Cys Ser Ala Leu Ala Val Ser His Gly Met Asn 
2450 2455 2460 

gac agc ggc caa ttc caa ctg gat ttc aat gac ccg cgt tac ctg ccg 9412 
Asp Ser Gly Gln Phe Gln Leu Asp Phe Asn Asp Pro Arg Tyr Leu Pro 
2465 2470 2475 

ttt gaa gga ctt cca gtt gat gac aca ggg acc ctg aca ctg agc ttc 9460 
Phe Glu Gly Leu Pro Val Asp Asp Thr Gly Thr Leu Thr Leu Ser Phe 
2480 2485 2490 2495 

ccg gat gct gac ggc aaa caa cag gcg atg ctc ctc agt ctg agc gac 9508 
Pro Asp Ala Asp Gly Lys Gln Gln Ala Met Leu Leu Ser Leu Ser Asp 
2500 2505 2510 

atc atc ctg cat atc cgt tac acc att atc agc tga tag gtatcaacat 9557 
Ile Ile Leu His Ile Arg Tyr Thr Ile Ile Ser 
2515 2520 

agcgcaggcc cccgaacgag ggcctgcgag gagactgagc atg caa aat cat caa 9612 
Met Gln Asn His Gln 
2525 

gac atg gcc att act gcc ccc acg ttg cct tcc ggg ggc ggt gcg gtc 9660 
Asp Met Ala Ile Thr Ala Pro Thr Leu Pro Ser Gly Gly Gly Ala Val 
2530 2535 2540 2545 

acc ccc etc aac cci cat ate ccc acc cca ccc ccc cat cct ccc ccc 9708 
Thr Gly Leu Lys Gly Arg Ile Ala Ala Ala Gly Pro Asp Gly Ala Ala 
2550 2555 2560 

acc ctg agt att ccc ttg ccg gtt agc ccc ggt cgg ggt tac gcc ccc 9756 
Thr Leu Ser Ile Pro Leu Pro Val Ser Pro Gly Arg Gly Tyr Ala Pro 
2565 2570 2575 

act ggg gca ctt aat tat cac agc cgg tcg ggg aac ggc ccc mx: ggc 9804 
Thr Gly Ala Leu Asn Tyr His Ser Arg Ser Gly Asn Gly Pro Phe Gly 
2580 2585 2590 

att ggc tgg ggt atc ggc ggt gct gct gtc cag cgt cgt acg cgc aac 9852 
Ile Gly Trp Gly Ile Gly Gly Ala Ala Val Gln Arg Arg Thr Arg Asn 
2595 2600 2605 

gga gca cct acc tac gat gat act gat gaa ttc acc ggt ccg gac ggt 9900 
Gly Ala Pro Thr Tyr Asp Asp Thr Asp Glu Phe Thr Gly Pro Asp Gly 
2610 2615 2620 2625 

gag gtg ctg gtg ccg gca ctc acg gct gct ggc acc caa gaa gca cgg 9948 
Glu Val Leu Val Pro Ala Leu Thr Ala Ala Gly Thr Gln Glu Ala Arg 
2630 2635 2640 



cag gcc acc tca cta ctg ggg ata aac cca ggc gga agc ttc aac gtt 9996 
Gln Ala Thr Ser Leu Leu Gly Ile Asn Pro Gly Gly Ser Phe Asn Val 
2645 2650 2655 

cag gtt tac cgt tca cgt acg gag ggt agt ctc agc cgc ctt gag cgt 10044 
Gln Val Tyr Arg Ser Arg Thr Glu Gly Ser Leu Ser Arg Leu Glu Arg 
2660 2665 2670 

tgg ctg ccc gcc gac gag aca gaa acg gaa ttt tgg gtg tta tat acc 10092 
Trp Leu Pro Ala Asp Glu Thr Glu Thr Glu Phe Trp Val Leu Tyr Thr 
2675 2680 2685 

cct gac gga cag gtg gct ctg ctg ggc cga aat gcg cag gct cgc atc 10140 
Pro Asp Gly Gln Val Ala Leu Leu Gly Arg Asn Ala Gln Ala Arg Ile 
2690 2695 2700 2705 

agc aac ccc aca gcc cca aca cag acg gcg gtt tgg ctg atg gag tcc 10188 
Ser Asn Pro Thr Ala Pro Thr Gln Thr Ala Val Trp Leu Met Glu Ser 
2710 2715 2720 

tcg gta tca ctt acc ggc gaa cag atg tat tac caa tac cgt gcg gaa 10236 
Ser Val Ser Leu Thr Gly Glu Gln Met Tyr Tyr Gln Tyr Arg Ala Glu 
2725 2730 2735 

gat gat gac ggt tgt gac gag gcg gag cgc gac gcg cac ccg cag gcc 10284 
Asp Asp Asp Gly Cys Asp Glu Ala Glu Arg Asp Ala His Pro Gln Ala 
2740 2745 2750 

ggc gcc caa cgt tat ccg gtg gcg gtc tgg tat ggt aac cgt cag gcg 10332 
Gly Ala Gln Arg Tyr Pro Val Ala Val Trp Tyr Gly Asn Arg Gln Ala 
2755 2760 2765 

gct cgg acg cta ccg gcg ctg gtg tcg aca cca tca atg gat agc tgg 10380 
Ala Arg Thr Leu Pro Ala Leu Val Ser Thr Pro Ser Met Asp Ser Trp 
2770 2775 2780 2785 

ctg ttt ate etc gtc ttt gat tat cat gac cgt age teg gtc etg tet 10428 
Leu Phe Ile Leu Val Phe Asp Tyr Gly Glu Arg Ser Ser Val Leu Ser 
2790 2795 2800 

gaa gcg ccg gcc tgg caa aca cca gga agt ggg gag tgg ctg tgt cgt 10476 
Glu Ala Pro Ala Trp Gln Thr Pro Gly Ser Gly Glu Trp Leu Cys Arg 
2805 2810 2815 

cag gat tgt ttt tcc ggg tat gag ttt ggt ttt aac ctg cgg act cgc 10524 
Gln Asp Cys Phe Ser Gly Tyr Glu Phe Gly Phe Asn Leu Arg Thr Arg 
2820 2825 2830 

cgc ctg tgc cgt cag gtt ttg atg ttc cat tac cta ggt gtt ctg gcg 10572 
Arg Leu Cys Arg Gln Val Leu Met Phe His Tyr Leu Gly Val Leu Ala 
2835 2840 2845 

ggg agt tcg gga gcg aat gat gcg cca gca ttg att tct cgc ctg ttg 10620 
Gly Ser Ser Gly Ala Asn Asp Ala Pro Ala Leu Ile Ser Arg Leu Leu 
2850 2855 2860 2865 

ctg gac tac agg gaa agt cct tca ctc agt ctg ctc gag aac gtg cac 10668 
Leu Asp Tyr Arg Glu Ser Pro Ser Leu Ser Leu Leu Glu Asn Val His 
2870 2875 2880 

cag gtg gct tat gag tcg gac ggg acg tct tgt gcc ttg ccg gca ctg 10716 
Gln Val Ala Tyr Glu Ser Asp Gly Thr Ser Cys Ala Leu Pro Ala Leu 
2885 2890 2895 

gca ttg ggg tgg caa acc ttt acc ccg ccg aca ttg tcg gca tgg cag 10764 
Ala Leu Gly Trp Gln Thr Phe Thr Pro Pro Thr Leu Ser Ala Trp Gln 
2900 2905 2910 

acg cgt gac gat atg ggc aag ttg agt ttg ctt caa ccc tat cag ctt 10812 
Thr Arg Asp Asp Met Gly Lys Leu Ser Leu Leu Gln Pro Tyr Gln Leu 
2915 2920 2925 

gta gac ctt aac ggc gaa ggt gtg gtg ggt atc ctg tat cag gac agc 10860 
Val Asp Leu Asn Gly Glu Gly Val Val Gly Ile Leu Tyr Gln Asp Ser 
2930 2935 2940 2945 

ggt gcc tgg tgg tac cgt gaa ccg gta cgc cag tcg ggg gat gat ccg 10908 
Gly Ala Trp Trp Tyr Arg Glu Pro Val Arg Gln Ser Gly Asp Asp Pro 
2950 2955 2960 

gat gct gtg acc tgg ggg gcg gct gcg gcc ctg ccg aca atg ccc gct 10956 
Asp Ala Val Thr Trp Gly Ala Ala Ala Ala Leu Pro Thr Met Pro Ala 
2965 2970 2975 

ttg cat aac agc ggc atc ctg gcg gat ctt aat ggg gat ggt cgg ctg 11004 
Leu His Asn Ser Gly Ile Leu Ala Asp Leu Asn Gly Asp Gly Arg Leu 
2980 2985 2990 



gag tgg gtc gtt acc gcc ccc ggt gtg gcg ggg atg tat gat cgc acc 11052 
Glu Trp Val Val Thr Ala Pro Gly Val Ala Gly Met Tyr Asp Arg Thr 
2995 3000 3005 

ccc ccc ccc cac tgc ttc cat ttc acc ccc etc tea ccc ttc ccc eta 11100 
Pro Gly Arg Asp Trp Leu His Phe Thr Pro Leu Ser Ala Leu Pro Val 
3010 3015 3020 3025 

gaa tat gcg cat cca aaa gca gtg ctc gcc gat atc ctg ggg gct ggg 11148 
Glu Tyr Ala His Pro Lys Ala Val Leu Ala Asp Ile Leu Gly Ala Gly 
3030 3035 3040 

tta acg g ac atg gtg c LL a l e ggg ccy e ye agt gtt cgc etc tat tec 11196 
Leu Thr Asp Met Val Leu Ile Gly Pro Arg Ser Val Arg Leu Tyr Ser 
3045 3050 3055 



ggc aaa aac gat ggt tgg aat aaa ggg gag acc gtg cag caa acg gaa 11244
Gly Lys Asn Asp Gly Trp Asn Lys Gly Glu Thr Val Gln Gln Thr Glu
3060 3065 3070

aga ctc act ctg ccg gtc ccg ggg gtt gac cca cgt acc ctc gtg gcg 11292
Arg Leu Thr Leu Pro Val Pro Gly Val Asp Pro Arg Thr Leu Val Ala
3075 3080 3085

ttc agt gat atg gct ggc agt gga cag cag cat ttg acg gag gtg cgt 11340
Phe Ser Asp Met Ala Gly Ser Gly Gln Gln His Leu Thr Glu Val Arg
3090 3095 3100 3105

gct aat gga gta cgt tac tgg cca aac ctg ggg cac ggt cgt ttc ggt 11388
Ala Asn Gly Val Arg Tyr Trp Pro Asn Leu Gly His Gly Arg Phe Gly
3110 3115 3120

cag ccg gtg aat att ccc ggt ttt agc cag tca gtg act acg ttt aac 11436
Gln Pro Val Asn Ile Pro Gly Phe Ser Gln Ser Val Thr Thr Phe Asn
3125 3130 3135

cct gac cag ata ttg ctg gcc gat acc gac ggt tcc ggt acc acg gac 11484
Pro Asp Gln Ile Leu Leu Ala Asp Thr Asp Gly Ser Gly Thr Thr Asp
3140 3145 3150

ctg att tat gcg atg agt gac cgg tta gtc att tat ttc aac cag agt 11532
Leu Ile Tyr Ala Met Ser Asp Arg Leu Val Ile Tyr Phe Asn Gln Ser
3155 3160 3165

ggt aat tat ttc gcc gag ccg cat acg ctg ctc ttg ccg aaa ggt gtg 11580
Gly Asn Tyr Phe Ala Glu Pro His Thr Leu Leu Leu Pro Lys Gly Val
3170 3175 3180 3185

cgc tat gat cgc acc tgc agt ctg caa gtg gcg gat atc cag ggg ctg 11628
Arg Tyr Asp Arg Thr Cys Ser Leu Gln Val Ala Asp Ile Gln Gly Leu
3190 3195 3200

ggg gtg cct agc ctg tta ctg acg gtc ccc cat gtc gcg cct cat cac 11676
Gly Val Pro Ser Leu Leu Leu Thr Val Pro His Val Ala Pro His His
3205 3210 3215

tgg gtg tgc cat tta tcg gca gac aaa ccc tgg ttg ttg aat ggc atg 11724
Trp Val Cys His Leu Ser Ala Asp Lys Pro Trp Leu Leu Asn Gly Met
3220 3225 3230

aac aac aat atg ggg gcc cgg cat gca ctg cac tat cgc agt tcg gtg 11772
Asn Asn Asn Met Gly Ala Arg His Ala Leu His Tyr Arg Ser Ser Val
3235 3240 3245

cac ttc tec etc cai cac aaa ccc cac cca etc ccc cca ccc aci zee 11820 
Gln Phe Trp Leu Asp Glu Lys Ala Glu Ala Leu Ala Ala Gly Ser Ser
3250 3255 3260 3265

cct gcc tgc tac ctg cca ttt aca ttg cat acc ctg tgg cgt tcg gtg 11868
Pro Ala Cys Tyr Leu Pro Phe Thr Leu His Thr Leu Trp Arg Ser Val
3270 3275 3280



gtg cag gat gag atc acc ggt aac cgt ctg gtc agc gac gtg ctt tat 11916
Val Gln Asp Glu Ile Thr Gly Asn Arg Leu Val Ser Asp Val Leu Tyr
3285 3290 3295

cgc cac ggc gtc tgg gac ggg cag gaa cgc gag ttt cgg ggg ttt ggt 11964
Arg His Gly Val Trp Asp Gly Gln Glu Arg Glu Phe Arg Gly Phe Gly
3300 3305 3310

ttt gtt gag atc agg gat acc gat acc ttg gca agc cag ggt acg gcg 12012
Phe Val Glu Ile Arg Asp Thr Asp Thr Leu Ala Ser Gln Gly Thr Ala
3315 3320 3325

acg gaa ctg agt atg cct tct gtg agc cgg aac tgg tat gcc acc ggg 12060
Thr Glu Leu Ser Met Pro Ser Val Ser Arg Asn Trp Tyr Ala Thr Gly
3330 3335 3340 3345

gta ccg gca gta gac gag cgt ctg ccg gag acg tat tgg caa aac gat 12108
Val Pro Ala Val Asp Glu Arg Leu Pro Glu Thr Tyr Trp Gln Asn Asp
3350 3355 3360

gcc gcc gct ttt gcc gat ttc gcg acc cgt ttc act gtc ggt tca gga 12156
Ala Ala Ala Phe Ala Asp Phe Ala Thr Arg Phe Thr Val Gly Ser Gly
3365 3370 3375

gag gat gag cag aca tat act ccg gac gac agc aag aca ttc tgg ttg 12204
Glu Asp Glu Gln Thr Tyr Thr Pro Asp Asp Ser Lys Thr Phe Trp Leu
3380 3385 3390

cag cga gcc ctg aaa ggc atc ctg ctg cgc agt gag tta tac ggt gcc 12252
Gln Arg Ala Leu Lys Gly Ile Leu Leu Arg Ser Glu Leu Tyr Gly Ala
3395 3400 3405

gat ggc agc agc cag gcc gat atc cct tac agc gtc act gag tct cgc 12300
Asp Gly Ser Ser Gln Ala Asp Ile Pro Tyr Ser Val Thr Glu Ser Arg
3410 3415 3420 3425

ccg cag gta cgg cta gtt gaa gcg aat gga gac tac ccg gtg gtg tgg 12348
Pro Gln Val Arg Leu Val Glu Ala Asn Gly Asp Tyr Pro Val Val Trp
3430 3435 3440

ccg atg ggc gcg gaa agc cgt acg tca gtt tat gaa cgg tac cac aat 12396
Pro Met Gly Ala Glu Ser Arg Thr Ser Val Tyr Glu Arg Tyr His Asn
3445 3450 3455

gat cct caa tgc caa cag cag gcg gta ctc ctc agt gat gaa tac ggt 12444
Asp Pro Gln Cys Gln Gln Gln Ala Val Leu Leu Ser Asp Glu Tyr Gly
3460 3465 3470

ttc cca etc cct cac etc act etc aat tat cca cca ccc cct ccc tec 12492 
Phe Pro Leu Arg Gln Val Ser Val Asn Tyr Pro Arg Arg Pro Pro Ser
3475 3480 3485

gcg gac aat cca tat ccg gcg tcc tta ccg gcg acg ctg ttc gcc aac 12540
Ala Asp Asn Pro Tyr Pro Ala Ser Leu Pro Ala Thr Leu Phe Ala Asn 
3490 3495 3500 3505 

agt tat gac gag cag cag cag ata tta cgc ctg ggg ttg caa cag agc 12588
Ser Tyr Asp Glu Gln Gln Gln Ile Leu Arg Leu Gly Leu Gln Gln Ser
3510 3515 3520

agt gca cat cac ctt gtt tca ctg tct gag ggg cat tgg ttg ttg ggg 12636
Ser Ala His His Leu Val Ser Leu Ser Glu Gly His Trp Leu Leu Gly
3525 3530 3535

ttg gcg gag gcg tcg cgg gac gat gta ttc acg tac tct gcg gac aac 12684
Leu Ala Glu Ala Ser Arg Asp Asp Val Phe Thr Tyr Ser Ala Asp Asn
3540 3545 3550



gtg ccg gaa ggg ggt ctg acg ctg gaa cac ctg ttg gcg ccc gaa agc 12732
Val Pro Glu Gly Gly Leu Thr Leu Glu His Leu Leu Ala Pro Glu Ser
3555 3560 3565

ctg gtc tcg gat agt cag gtc ggt acg ctg gcg ggt cag cag caa gtc 12780
Leu Val Ser Asp Ser Gln Val Gly Thr Leu Ala Gly Gln Gln Gln Val
3570 3575 3580 3585

tgg tat ctg gat tca caa gac gtt gcc acc gtc gct gct ccg cca ctc 12828
Trp Tyr Leu Asp Ser Gln Asp Val Ala Thr Val Ala Ala Pro Pro Leu
3590 3595 3600

ccc ccc aag gta gct ttt atc gaa acg gcc gtg ctg gat gag ggt atg 12876
Pro Pro Lys Val Ala Phe Ile Glu Thr Ala Val Leu Asp Glu Gly Met
3605 3610 3615

gtc agt tca ctg gct gcc tac att gtg gat gaa cat ctc gag caa gcc 12924
Val Ser Ser Leu Ala Ala Tyr Ile Val Asp Glu His Leu Glu Gln Ala
3620 3625 3630

ggt tac cgg caa tcc gga tac ctt ttc cct cga ggc agg gaa gca gaa 12972
Gly Tyr Arg Gln Ser Gly Tyr Leu Phe Pro Arg Gly Arg Glu Ala Glu
3635 3640 3645

cag gca ttg tgg acc cag tgt cag gga tat gtt acc tat gcc ggc gca 13020
Gln Ala Leu Trp Thr Gln Cys Gln Gly Tyr Val Thr Tyr Ala Gly Ala
3650 3655 3660 3665

gag cat ttc tgg cta ccg cta tcc ttt cgg gac agt atg ttg acc ggc 13068
Glu His Phe Trp Leu Pro Leu Ser Phe Arg Asp Ser Met Leu Thr Gly
3670 3675 3680

cca gtt acc gtg acg cgt gac gcg tac gac tgc gtc atc acg cag tgg 13116
Pro Val Thr Val Thr Arg Asp Ala Tyr Asp Cys Val Ile Thr Gln Trp
3685 3690 3695

cag gat gcc gca ggg att gtc acc aca gcc gac tat gac tgg cgc ttc 13164
Gln Asp Ala Ala Gly Ile Val Thr Thr Ala Asp Tyr Asp Trp Arg Phe
3700 3705 3710

etc a c c ccc etc cgc- etc acc cac ccc a a. t get aai etc cac ice etc 13212

Leu Thr Pro Val Arg Val Thr Asp Pro Asn Asp Asn Leu Gln Ser Val
3715 3720 3725

act ctg gat gct ctg ggc cgg gtg acc acc ctg cga ttc tgg ggc acg 13260
Thr Leu Asp Ala Leu Gly Arg Val Thr Thr Leu Arg Phe Trp Gly Thr
3730 3735 3740 3745

gag aat ggt att gcc acc ggt tac agt gat gcc acg ttg tcc gtt ccg 13308
Glu Asn Gly Ile Ala Thr Gly Tyr Ser Asp Ala Thr Leu Ser Val Pro
3750 3755 3760

gac ggc gca gca gcc gct ctg gcg ttg acg gcg ccc cta cca gta gca 13356
Asp Gly Ala Ala Ala Ala Leu Ala Leu Thr Ala Pro Leu Pro Val Ala
3765 3770 3775



cag tgt ctg gtg tat gtc acg gac agt tgg gga gat gac gac aat gag 13404
Gln Cys Leu Val Tyr Val Thr Asp Ser Trp Gly Asp Asp Asp Asn Glu
3780 3785 3790

aaa atg ccc ccg cac gtg gtc gtg ctg gct acc gat cgc tat gac agt 13452
Lys Met Pro Pro His Val Val Val Leu Ala Thr Asp Arg Tyr Asp Ser 
3795 3800 3805 

gat acc gga cag cag gtc cgc caa cag gtg aca ttc agt gac ggt ttt 13500
Asp Thr Gly Gln Gln Val Arg Gln Gln Val Thr Phe Ser Asp Gly Phe
3810 3815 3820 3825

ggg cgt gag ttg caa tcg gca acc cgg cag gcc gag ggc aac gcc tgg 13548
Gly Arg Glu Leu Gln Ser Ala Thr Arg Gln Ala Glu Gly Asn Ala Trp
3830 3835 3840

caa cga gga cgc gac ggc aaa ctg gtg acg gcc agt gac gga ttg ccg 13596
Gln Arg Gly Arg Asp Gly Lys Leu Val Thr Ala Ser Asp Gly Leu Pro
3845 3850 3855

gtc act gta gca acg aat ttc cgc tgg gcg gtc acc ggg agg gcg gag 13644
Val Thr Val Ala Thr Asn Phe Arg Trp Ala Val Thr Gly Arg Ala Glu
3860 3865 3870



tat gac aat aaa ggt ctg cct gtt cgg gtt tat cag ccg tat ttt ctg 13692
Tyr Asp Asn Lys Gly Leu Pro Val Arg Val Tyr Gln Pro Tyr Phe Leu
3875 3880 3885

gac agt tgg caa tat gtc agt gat gac agt gcc cgc cag gac ctg tat 13740
Asp Ser Trp Gln Tyr Val Ser Asp Asp Ser Ala Arg Gln Asp Leu Tyr
3890 3895 3900 3905

gcc gac acg cac ttt tac gat ccg acg gca cgg gaa tgg cag gtt att 13788
Ala Asp Thr His Phe Tyr Asp Pro Thr Ala Arg Glu Trp Gln Val Ile
3910 3915 3920



acg gca aaa ggt gaa cgg cga cag gtg ctg tat acc ccg tgg ttt gtg 13836
Thr Ala Lys Gly Glu Arg Arg Gln Val Leu Tyr Thr Pro Trp Phe Val
3925 3930 3935

gtc act gaa cac gac aat cat acc gtt egg eta aac gac gca tec tea 13884
Va J £e: G2u Ast GJ-J Ask Asr Th: Val GJy Let: Asr. A e t A] £ £ e : 
3940 3945 3950

ctgggaagga gggggggacg gtg atg agt ccg tcg ccc ctg aca ggc gct gcc 13937
Met Ser Pro Ser Pro Leu Thr Gly Ala Ala
3955 3960

ctg atg gag aca aag atg aaa ata cac tat cag gtt gcg gcg gtt gtg 13985
Leu Met Glu Thr Lys Met Lys Ile His Tyr Gln Val Ala Ala Val Val
3965 3970 3975

ctg aca ggt gtt atg gtt tgg ggg ctt tcc cat tgg cgt tac acc gtc 14033
Leu Thr Gly Val Met Val Trp Gly Leu Ser His Trp Arg Tyr Thr Val
3980 3985 3990 3995

ggt tac cac gcg gca gat act caa tgg caa caa cgc cag gcc gaa cag 14081
Gly Tyr His Ala Ala Asp Thr Gln Trp Gln Gln Arg Gln Ala Glu Gln
4000 4005 4010



gaa agg gcc gat gcg ttg gcc ctc ctg gca gca gaa acc cgg gaa aga 14129
Glu Arg Ala Asp Ala Leu Ala Leu Leu Ala Ala Glu Thr Arg Glu Arg
4015 4020 4025

aag tgg gag cag caa cga cag act gac atg aac aag gtg gct ata cat 14177
Lys Trp Glu Gln Gln Arg Gln Thr Asp Met Asn Lys Val Ala Ile His
4030 4035 4040

gct gaa gaa gaa ctg gct gct gcg cgt gac gct gcc gct gat gct cag 14225
Ala Glu Glu Glu Leu Ala Ala Ala Arg Asp Ala Ala Ala Asp Ala Gln
4045 4050 4055

cgc act ggt cag cgc ctg cag cac acc gtt acc acc ctc cag cgg caa 14273
Arg Thr Gly Gln Arg Leu Gln His Thr Val Thr Thr Leu Gln Arg Gln
4060 4065 4070 4075

ctt gcc agt cgt gaa acc cgc cgc ctt tcc gca gct acc gct atc ggt 14321
Leu Ala Ser Arg Glu Thr Arg Arg Leu Ser Ala Ala Thr Ala Ile Gly
4080 4085 4090

aca gac gac ctc gga ggc caa ccc ggc gtt ttg ttt gcc gaa ctg ttc 14369
Thr Asp Asp Leu Gly Gly Gln Pro Gly Val Leu Phe Ala Glu Leu Phe
4095 4100 4105

cgc cgc gct gac cag aga gcg gga gag ctg gca gcg tat gct gac agg 14417
Arg Arg Ala Asp Gln Arg Ala Gly Glu Leu Ala Ala Tyr Ala Asp Arg
4110 4115 4120

acc aga gtg aaa tgg cag gcc tgc ggg cgc gcc tat cag gcg gct acg 14465
Thr Arg Val Lys Trp Gln Ala Cys Gly Arg Ala Tyr Gln Ala Ala Thr
4125 4130 4135

cac gaa gca gaa aaa taa ggcgatttag ccgttaagga aaagtgacgg 14513
His Glu Ala Glu Lys
4140 4145

tgttttcgcg attaatatta acaggagatc ac atg agc aca tcc ttg ttc agt 14566
Met Ser Thr Ser Leu Phe Ser
4150

acc acc ccc tec etc ccc etc etc gac aac ccc ccc etc ttc etc ccc 14614
Ser Thr Pro Ser Val Ala Val Leu Asp Asn Arg Gly Leu Leu Val Arg 
4155 4160 4165 

gag ctg cag tac tac cgc cat ccg gat aca ccg gag gag acg gac gag 14662
Glu Leu Gln Tyr Tyr Arg His Pro Asp Thr Pro Glu Glu Thr Asp Glu
4170 4175 4180

cgt atc acc tgc cat cag cac gat gag cgc ggc agc ttg tca caa agc 14710
Arg Ile Thr Cys His Gln His Asp Glu Arg Gly Ser Leu Ser Gln Ser
4185 4190 4195 4200

gcc gac ccg cgg tta cac gcg gcc ggt ctg aca aat ttc acg tac ctg 14758
Ala Asp Pro Arg Leu His Ala Ala Gly Leu Thr Asn Phe Thr Tyr Leu
4205 4210 4215

aat agc ctg acc ggg aca gta ctg cag agc gtc agc gcc gat gcc ggt 14806
Asn Ser Leu Thr Gly Thr Val Leu Gln Ser Val Ser Ala Asp Ala Gly
4220 4225 4230



acg tcg ctg gaa ctg agc gat gcc gcc ggg cgg gcg ttt ctg gcc gtc 14854
Thr Ser Leu Glu Leu Ser Asp Ala Ala Gly Arg Ala Phe Leu Ala Val
4235 4240 4245

acc ggg gct ggg acg gaa gac gcg gtc acc cgc acc tgg caa tat gaa 14902
Thr Gly Ala Gly Thr Glu Asp Ala Val Thr Arg Thr Trp Gln Tyr Glu
4250 4255 4260

gac gat acc ctg ccg ggc cgc ccg ctg agc atc acc gag cag gtt acc 14950
Asp Asp Thr Leu Pro Gly Arg Pro Leu Ser Ile Thr Glu Gln Val Thr
4265 4270 4275 4280

ggt gaa gcc gcc caa att acg gaa cgc ttc gtg tac gct ggc aat acg 14998
Gly Glu Ala Ala Gln Ile Thr Glu Arg Phe Val Tyr Ala Gly Asn Thr
4285 4290 4295

gat gcc gag aag att ctc aat ctg gct ggc cag tgt gtc agt cat tac 15046
Asp Ala Glu Lys Ile Leu Asn Leu Ala Gly Gln Cys Val Ser His Tyr
4300 4305 4310

gat acc gcc gga ctg gtg cag acg gac agc atc gcc ctg agc ggc gtg 15094
Asp Thr Ala Gly Leu Val Gln Thr Asp Ser Ile Ala Leu Ser Gly Val
4315 4320 4325

ccg ctc gcc gtc acg cgg cag ttg ctg ccc gac gcg gcg ggg gcc aac 15142
Pro Leu Ala Val Thr Arg Gln Leu Leu Pro Asp Ala Ala Gly Ala Asn
4330 4335 4340

tgg atg ggt gag gat gcc tcg gcc tgg aat gac ctg ctg gat ggg gag 15190
Trp Met Gly Glu Asp Ala Ser Ala Trp Asn Asp Leu Leu Asp Gly Glu
4345 4350 4355 4360



acg ttc ttc acc cag acc cac gct gat gcg acc ggc gcc gtc ctg agc 15238
Thr Phe Phe Thr Gln Thr His Ala Asp Ala Thr Gly Ala Val Leu Ser
4365 4370 4375

ate acc cat gca aaa gjr^ ; aat etc cac cgt etc gca tat cat etc get 15286
j 3 e Thr Asr Ala Lys C(^- Asr. Lev Gin Arc £ $a 2 Ala Ty? Asr Vsj A3 a
4380 4385 4390

ggg ctg cta tcg ggc agt tgg ttg acg ctg aag gac ggc acg gag cag 15334
Gly Leu Leu Ser Gly Ser Trp Leu Thr Leu Lys Asp Gly Thr Glu Gln
4395 4400 4405

gtc atc gtg gcc tcc ctg acg tac tcg gcc gcc ggg aaa aag ttg cgt 15382
Val Ile Val Ala Ser Leu Thr Tyr Ser Ala Ala Gly Lys Lys Leu Arg
4410 4415 4420

gaa gaa cac ggc aac ggc gtg gta acc tcg tat att tac gag ccg gaa 15430
Glu Glu His Gly Asn Gly Val Val Thr Ser Tyr Ile Tyr Glu Pro Glu
4425 4430 4435 4440



aca cag cgc ctg acg ggg att aaa acg gaa cgt ccg tct ggg cac gtt 15478 
Thr Gln Arg Leu Thr Gly Ile Lys Thr Glu Arg Pro Ser Gly His Val
4445 4450 4455 



gcc gga gca aaa gtg ctg cag gac ctg cgc tat acg tat gac ccg gta 15526
Ala Gly Ala Lys Val Leu Gln Asp Leu Arg Tyr Thr Tyr Asp Pro Val
4460 4465 4470

ggc aac gta ctc agc gtc aat aac gat gcg gaa gag acc cgc ttc tgg 15574
Gly Asn Val Leu Ser Val Asn Asn Asp Ala Glu Glu Thr Arg Phe Trp
4475 4480 4485

cgt aac cag aaa gtg gta ccg gag aat acg tac atc tac gac agc ctg 15622
Arg Asn Gln Lys Val Val Pro Glu Asn Thr Tyr Ile Tyr Asp Ser Leu
4490 4495 4500



tac cag ctg gtc agc gcc aca ggg cgt gag atg gcc aat gcc ggc cag 15670
Tyr Gln Leu Val Ser Ala Thr Gly Arg Glu Met Ala Asn Ala Gly Gln
4505 4510 4515 4520

cag ggg aac gac tta cca tcc gct aca gcc ccc ctt cct aca gac agc 15718
Gln Gly Asn Asp Leu Pro Ser Ala Thr Ala Pro Leu Pro Thr Asp Ser
4525 4530 4535



tct gcc tac acc aat tac acg cgc acc tac cgt tat gac cgt ggt ggc 15766
Ser Ala Tyr Thr Asn Tyr Thr Arg Thr Tyr Arg Tyr Asp Arg Gly Gly
4540 4545 4550

aac ctg acg cag atg cgc cac agt gcc cct gcc acg aac aat aat tat 15814
Asn Leu Thr Gln Met Arg His Ser Ala Pro Ala Thr Asn Asn Asn Tyr
4555 4560 4565

acg aca gac atc acg gtt agt gac cgc agc aat agg gcg gta ctg agc 15862
Thr Thr Asp Ile Thr Val Ser Asp Arg Ser Asn Arg Ala Val Leu Ser
4570 4575 4580






acg ttg gcg gaa gtg ccg tca gat gtt gat atg ctg ttc agt gca gga 15910
Thr Leu Ala Glu Val Pro Ser Asp Val Asp Met Leu Phe Ser Ala Gly
4585 4590 4595 4600

ggt cac cag aag cac ctg cag ccg ggg caa gca ctg gtg tgg acg cca 15958
Gly His Gln Lys His Leu Gln Pro Gly Gln Ala Leu Val Trp Thr Pro
4605 4610 4615

cct ccc. Ceo etc caa aac: etc aca ccc etc etc cci cai ccc ccc ccc 16006
Arg Gly Glu Leu Gln Lys Val Thr Pro Val Val Arg Asp Gly Gly Ala
4620 4625 4630



gac gac agc gaa agc tat cgg tat gat gcg ggc agt cag cgt att atc 16054
Asp Asp Ser Glu Ser Tyr Arg Tyr Asp Ala Gly Ser Gln Arg Ile Ile
4635 4640 4645

aaa acc ggc acg cgg caa act ggc aac aac gtt cag aca cag cgg gta 16102
Lys Thr Gly Thr Arg Gln Thr Gly Asn Asn Val Gln Thr Gln Arg Val
4650 4655 4660

gtg tac ctg ccg ggg ctg gag tta cgt atc atg gca aat ggc gtg acg 16150
Val Tyr Leu Pro Gly Leu Glu Leu Arg Ile Met Ala Asn Gly Val Thr
4665 4670 4675 4680



gaa aaa gaa agc ctg cag gtt att acg gtg ggc gag gct ggg cgg gca 16198
Glu Lys Glu Ser Leu Gln Val Ile Thr Val Gly Glu Ala Gly Arg Ala
4685 4690 4695

caa gtg cgc gta ttg cac tgg gag atc ggc aag ccg gat gac ctc gat 16246
Gln Val Arg Val Leu His Trp Glu Ile Gly Lys Pro Asp Asp Leu Asp
4700 4705 4710

gag gac tcg gtg cgt tac agt tac gat aac ctg gtg ggc agc agc cag 16294
Glu Asp Ser Val Arg Tyr Ser Tyr Asp Asn Leu Val Gly Ser Ser Gln
4715 4720 4725

ctg gag ctg gac aga gag ggt tac ctt atc agt gag gag gag ttc tac 16342
Leu Glu Leu Asp Arg Glu Gly Tyr Leu Ile Ser Glu Glu Glu Phe Tyr
4730 4735 4740

ccg tat ggc gga acg gct gtt ctg acg gcg cga agt gag gtt gag gct 16390
Pro Tyr Gly Gly Thr Ala Val Leu Thr Ala Arg Ser Glu Val Glu Ala
4745 4750 4755 4760

gac tac aaa act atc cga tac tca ggc aag gag cgt gac gcg acg ggg 16438
Asp Tyr Lys Thr Ile Arg Tyr Ser Gly Lys Glu Arg Asp Ala Thr Gly
4765 4770 4775

ctg gat tat tac ggt tat cgg tat tac cag cca tgg gca ggg cgc tgg 16486
Leu Asp Tyr Tyr Gly Tyr Arg Tyr Tyr Gln Pro Trp Ala Gly Arg Trp
4780 4785 4790

ctc tcc acg gac ccg gca ggc acg gtg gac ggg ctg aac ctg ttc cgc 16534
Leu Ser Thr Asp Pro Ala Gly Thr Val Asp Gly Leu Asn Leu Phe Arg
4795 4800 4805

atg gtg cgg aat aat ccc gtc acg ctg ttt gac agc aac ggg cgg atc 16582
Met Val Arg Asn Asn Pro Val Thr Leu Phe Asp Ser Asn Gly Arg Ile
4810 4815 4820

agt act ggt cag gag gcc aga cga tta gtg ggg gaa gca ttt gtt cat 16630
Ser Thr Gly Gln Glu Ala Arg Arg Leu Val Gly Glu Ala Phe Val His
4825 4830 4835 4840

ccc tta cac ate cct ctt ttt gaa aca att tct eta cac aca aac att 16678
Pro Leu His Met Pro Val Phe Glu Arg Ile Ser Val Glu Arg Lys Ile
4845 4850 4855

tca atg agc gta agg gaa gct ggc att tat act att tca gcg ctg ggt 16726
Ser Met Ser Val Arg Glu Ala Gly Ile Tyr Thr Ile Ser Ala Leu Gly
4860 4865 4870

gaa ggt gca gca gca aaa ggc cat aat att cta gag aaa acc att aaa 16774
Glu Gly Ala Ala Ala Lys Gly His Asn Ile Leu Glu Lys Thr Ile Lys
4875 4880 4885

ccc ggt tcc ctg aag gct atc tat ggt gat aaa gct gag tca att ctt 16822
Pro Gly Ser Leu Lys Ala Ile Tyr Gly Asp Lys Ala Glu Ser Ile Leu
4890 4895 4900

gga ctg gca aaa cgt agc ggt ctc gtt ggc cga gta gga cag tgg gat 16870
Gly Leu Ala Lys Arg Ser Gly Leu Val Gly Arg Val Gly Gln Trp Asp
4905 4910 4915 4920



gca tca ggt gta cgt gga att tat gcg cac aac aga ccg ggt ggt gag 16918
Ala Ser Gly Val Arg Gly Ile Tyr Ala His Asn Arg Pro Gly Gly Glu
4925 4930 4935

gat ttg gtt tat cct gtc agc ctg cag aat act tct gcc aat gaa att 16966
Asp Leu Val Tyr Pro Val Ser Leu Gln Asn Thr Ser Ala Asn Glu Ile
4940 4945 4950

gtt aat gca tgg ata aaa ttt aaa atc atc acg ccc tac acc ggg gat 17014
Val Asn Ala Trp Ile Lys Phe Lys Ile Ile Thr Pro Tyr Thr Gly Asp
4955 4960 4965

tat gac atg cac gat att att aaa ttc tct gat ggg aaa ggg cat gtg 17062
Tyr Asp Met His Asp Ile Ile Lys Phe Ser Asp Gly Lys Gly His Val
4970 4975 4980

cct aca gcg gaa agt agt gag gaa aga gga gta aaa gat cta att aat 17110
Pro Thr Ala Glu Ser Ser Glu Glu Arg Gly Val Lys Asp Leu Ile Asn
4985 4990 4995 5000

aaa ggt gtt gcg gag gtc gat cct tcc aga ccc ttt gag tat aca gcg 17158
Lys Gly Val Ala Glu Val Asp Pro Ser Arg Pro Phe Glu Tyr Thr Ala
5005 5010 5015

atg aat gtt att cgc cat gga cca cag gtg aac ttt gtt ccc tat atg 17206
Met Asn Val Ile Arg His Gly Pro Gln Val Asn Phe Val Pro Tyr Met
5020 5025 5030

tgg gaa cat gag cac gat aaa gtc gtt aat gat aat ggt tat ctg ggg 17254
Trp Glu His Glu His Asp Lys Val Val Asn Asp Asn Gly Tyr Leu Gly
5035 5040 5045

gtg gta gct agc ccg ggg ccg ttc ccg gta gcg atg gta cat cag ggg 17302
Val Val Ala Ser Pro Gly Pro Phe Pro Val Ala Met Val His Gln Gly
5050 5055 5060

gaa tgg act gtt ttt gac aac agt gaa gaa ctg ttt aat ttc tat aaa 17350
Glu Trp Thr Val Phe Asp Asn Ser Glu Glu Leu Phe Asn Phe Tyr Lys
5065 5070 5075 5080

tct aca aa\ aca cci eti cci caa cac tec tec caa ca: ttt etc cac 17398
Ser Thr Asn Thr Pro Leu Pro Glu His Trp Ser Gln Asp Phe Met Asp
5085 5090 5095

aga ggg aaa gga ata gtc gca act cct cgg cat gct gaa ctt ctt gat 17446
Arg Gly Lys Gly Ile Val Ala Thr Pro Arg His Ala Glu Leu Leu Asp
5100 5105 5110

aaa cga cga gtc atg tac taa tcgtaacgat ttcctgcctt acccaaagta 17497
Lys Arg Arg Val Met Tyr
5115

tacagcccgg tgagacattt tctctgtctc atttgggttg tttttgtctc atctgcatgt 17557
tatgtcttcc ctcatctaaa gtctaacgag acatttttag caaaatggca ctttacggtt 17617
atgttcgcgt ttcaaccgac ggtccggatt ttactctgta aatacagaca cttcgcgcag 17677
cctgctgcga aattatccgt gcgaaaaaag ccagcggcag cagccgggat ggacgaaatg 17737
aactgcagct tctgctggct tttttgcggc caggcaacat gctgatggtt acgtgagttg 17797
atcggctgcc accaaaaagt ccggagcgtg cggcccagat cgccgcaata atactgctgt 17857 
atggtatttc catcaccact gtatatcgca cactctgggc cttccagaaa ccccataccg 17917 
cacaccggtg tgatcgctgg aagccccggg cattaccgcc gtctgtactc gaacactatt 17977 
gtggacttga tggttaggag attgaatcga ccatttttga gatccctaac catagatcgt 18037 
agagttgcac actcccagat ggcgtggctt agcgagcgat tatgcttaaa aattcatgtt 18097 
ttgctgtgtt tttaatccaa aacctgcttt tcaggcgcac ttatccagct acggggtctg 18157 
aagccatcgt ttttttgccg tacgatgtag cctgtcagag agcatttttg tggcgtgctc 18217 
gcccgctacg gtaccggcgg caaaacgcag ccggcctttg cagaggatgc actggtacgg 18277 
atcggtgccc aggaagcctt tcatcagcac cgcgaacccg ggccgtttcg gtttctcccg 18337 
taccgtcatc tccagcgcgt cgtaaacctt cggcagcagc gtgcccgttt gcggttggcc 18397 
agaaaaccat agtaacgcac cattttaaaa tgccgtgcag ggatatggct gacgtaacgc 18457 
tgcagcatct cctcctggct gattttctgg cgtttgtgct gctgcgtacg gtgatcgtaa 18517 
tactgatgca ccacggcccc gccgcggtag tggcgtagct gagaagccgc caccggcggg 18577 
cgcttcaggt accgggccag gtatttcacg ctgcgccagg cgccgcgggt ctttttggca 18637 
aaattcactt tccaggggcg gcggtattgc gcatgcaggg tcttcgttgc ggatatggcc 18697 
gagacccggc agggcgccag gattgatgcg cagcaggtga acgacggcat tgcgccagat 18757 
ggcttccacc tctttctttt taaagaacag ctgccgccag acgtggtgtt tgacgtcaag 18817 
accgccgcgc eta aegcaca cgtgoatetc cgcatcttca ttgacctccc ccccttacat 18877
gt-ggagegee caaaaaatgc cggcctcgai gccctgccgg cgcgcccagc ggagcatggc 18937
(2) INFORMATION FOR SEQ ID NO: 2:



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 144 amino acid residues 

(B) TYPE: amino acid 
(D) TOPOLOGY: Linear 

(ii) MOLECULE TYPE: PROTEIN (ORF 1) 

(ix) SEQUENCE DESCRIPTION: SEQ ID NO : 2: 

Met Lys Ile Ser Ser Arg Gly Ile Ala Leu Ile Lys Glu Phe Glu Gly
1 5 10 15

Leu Arg Leu His Ala Tyr Arg Cys Ala Ala Asp Val Trp Thr Val Gly
20 25 30

Tyr Gly His Thr Ala Gly Val Thr Lys Gly Asp Ile Ile Thr Val Asp
35 40 45

Glu Ala Gln Thr Met Leu Thr Asn Asp Ile Thr Val Phe Glu Arg Ala
50 55 60

Val Ser Gln Ala Val Ala Val Pro Leu Asn Gln Ser Gln Tyr Asp Ala
65 70 75 80

Leu Val Ser Leu Val Phe Asn Ile Gly Gln Gly Asn Phe Lys Arg Ser
85 90 95

Thr Leu Leu Lys Lys Leu Asn Lys Gln Asp Tyr Val Gly Ala Gly Asn
100 105 110

Glu Phe Leu Arg Trp Thr Arg Ala Asn Gly Lys Val Leu Pro Gly Leu
115 120 125

Ile Arg Arg Arg Glu Ala Glu Arg Val Leu Phe Glu Lys Leu Gly Ala
130 135 140
(2) INFORMATION FOR SEQ ID NO: 3:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 191 amino acid residues 

(B) TYPE: amino acid 
(D) TOPOLOGY: Linear 

(ii) MOLECULE TYPE: PROTEIN (ORF 2) 

(ix) SEQUENCE DESCRIPTION: SEQ ID NO : 3: 

Met Ser Pro Ser Pro Leu Thr Gly Ala Ala Leu Met Glu Thr Lys Met
1 5 10 15

Lys Ile His Tyr Gln Val Ala Ala Val Val Leu Thr Gly Val Met Val
20 25 30

Trp Gly Leu Ser His Trp Arg Tyr Thr Val Gly Tyr His Ala Ala Asp
35 40 45

Thr Gln Trp Gln Gln Arg Gln Ala Glu Gln Glu Arg Ala Asp Ala Leu
50 55 60

Ala Leu Leu Ala Ala Glu Thr Arg Glu Arg Lys Trp Glu Gln Gln Arg
65 70 75 80

Gln Thr Asp Met Asn Lys Val Ala Ile His Ala Glu Glu Glu Leu Ala
85 90 95

Ala Ala Arg Asp Ala Ala Ala Asp Ala Gln Arg Thr Gly Gln Arg Leu
100 105 110

Gln His Thr Val Thr Thr Leu Gln Arg Gln Leu Ala Ser Arg Glu Thr
115 120 125

Arg Arg Leu Ser Ala Ala Thr Ala Ile Gly Thr Asp Asp Leu Gly Gly
130 135 140

Gln Pro Gly Val Leu Phe Ala Glu Leu Phe Arg Arg Ala Asp Gln Arg
145 150 155 160

Ala Gly Glu Leu Ala Ala Tyr Ala Asp Arg Thr Arg Val Lys Trp Gln
165 170 175

Ala Cys Gly Arg Ala Tyr Gln Ala Ala Thr His Glu Ala Glu Lys
180 185 190
(2) INFORMATION FOR SEQ ID NO: 4:



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2376 amino acid residues 

(B) TYPE: amino acid 
(D) TOPOLOGY: Linear 

(ii) MOLECULE TYPE: PROTEIN (SepA) 

(ix) SEQUENCE DESCRIPTION: SEQ ID NO : 4: 



Met Arg Gln Asp Ile Met Tyr Asn Ile Asp Asp Ile Leu Glu Lys Val
1 5 10 15

Asn Ala Pro Arg Ala Arg Leu Ser Glu Glu Asn Asp Thr Ala Val Thr
20 25 30

Leu Thr Asp Leu Phe Ser Arg Ser Phe Pro Glu Val Lys Lys Ile Thr
35 40 45

Gly Asp Ser Leu Ser Trp Gly Glu Val Cys Tyr Leu Tyr Ser Gln Ala
50 55 60

Gln His Glu Gln Lys Glu Asn Arg Leu Thr Glu Ser Arg Ile Leu Ala
65 70 75 80

Arg Ala Asn Pro Leu Leu Val Asn Ala Val Arg Leu Gly Ile Arg Gln
85 90 95

Ala Ala Gly Ser Arg Ser Tyr Asp Asp Trp Phe Gly Ser Arg Ala Asp
100 105 110

Arg Phe Ala Arg Pro Gly Ser Val Ala Ser Met Phe Ser Pro Ala Ala
115 120 125

Tyr Leu Thr Glu Leu Tyr Arg Glu Ala Lys Asp Leu His Pro Asp Thr
130 135 140

Ser Leu Phe Arg Leu Asp Ile Arg Arg Pro Asp Leu Ala Ala Leu Ala
145 150 155 160

Leu Ser Gln Asn Asn Met Asp Asp Glu Leu Ser Thr Leu Ser Leu Ser
165 170 175

Asn Glu Leu Leu Tyr Arg Gly Ile Gly Ala Ala Glu Gly Leu Asp Asp
180 185 190
Asp Ser Val Arg Glu Leu Leu Ala Gly Tyr Arg Leu Thr Gly Leu Thr
195 200 205

Pro Tyr His Trp Ala Tyr Glu Ala Ala Arg Gln Ala Ile Leu Val Gln
210 215 220

Asp Pro Thr Leu Met Gly Phe Ser Arg Asn Pro Asp Val Ala Gln Leu
225 230 235 240

Met Asp Pro Ala Ser Met Leu Ala Ile Glu Ala Asp Ile Ser Pro Glu
245 250 255

Leu Tyr Gln Ile Leu Ala Glu Glu Ile Thr Thr Asp Ser Tyr Glu Ala
260 265 270

Leu Trp Ser Lys Asn Phe Gly Asp Met Pro Pro Ser Ser Leu Leu Ser
275 280 285

Tyr Asp Ala Leu Ala Thr Phe Tyr Asp Leu Asp Tyr Asp Glu Leu Thr
290 295 300

Ser Leu Leu Ser Leu Arg Leu Asp Phe Ser Asn Pro Asn Asn Glu Tyr
305 310 315 320

Tyr Ile Asn Ser Gln Leu Ser Val Val Thr Leu Asn Glu Ser Thr Gly
325 330 335

Leu Ile Thr Ile His His Tyr Leu Arg Thr Leu Gly Gly Asp Ser Gln
340 345 350

Gln Ile Asn Pro Glu Leu Ile Pro Tyr Gly Asp Gly Thr Tyr Leu Tyr
355 360 365

Asn Phe Ser Val Val Ser Thr Ile Ser Glu Asp Ser Phe Lys Leu Gly
370 375 380

Ser Leu Gly Ser Asn Ser Ser Asn Leu Tyr Ser Gly Asp Tyr Gln Leu
385 390 395 400



Gln Lys Gly Val Arg Tyr Ser Ile Pro Val Glu Ile Asp Glu Gly Lys
405 410 415

Leu Asn Asp Gly Ile Thr Ile Gly Leu Ser Arg Lys Gly Gly Gly Tyr
420 425 430

Tyr Ser Thr Val Asn Phe Thr Leu Ile Glu Tyr Asp Pro Ala Ile Phe
435 440 445

Ile Leu Lys Leu Asn Lys Val Ile Arg Leu Tyr Lys Ala Thr Gly Met
450 455 460

Thr Thr Ala Glu Ile Tyr Gln Ile Thr Asn Ile Leu Asn Asn Gly Leu
465 470 475 480

Thr Ile Asp His Ala Val Leu Ser Lys Ile Phe Leu Val Arg Tyr Leu
485 490 495

Met Arg His Tyr Gln Leu Asp Val Ala Arg Ser Leu Ile Leu Cys Asn
500 505 510

Gly Thr Ile Ser Asp Gln Ala Phe Ser Gly Glu Thr Gly Leu Phe Thr
515 520 525

Thr Leu Phe Asn Thr Pro Pro Leu Asn Gly Gln Leu Phe Ser Ala Asp
530 535 540

Asp Thr Pro Leu Asp Leu Arg Ser Glu Ala Pro Glu Asp Ala Phe Arg
545 550 555 560

Leu Ser Val Leu Lys Arg Ala Phe Asn Ile Ser Ala Ser Gly Leu Ser
565 570 575

Thr Leu Trp Gln Leu Ala Ser Gly Asp Ser Ser Ala Gly Phe Ser Cys
580 585 590

Ser Ala Asp Asn Ile Ala Ala Leu Tyr Arg Val Lys Leu Leu Ala Asp
595 600 605

Ile His Asp Leu Ser Ala Gly Glu Leu Ser Met Leu Leu Ser Val Ser
610 615 620

Pro Phe Ser Gly Val Ala Ala Gly Ser Leu Ser Asp Asn Glu Leu Thr
625 630 635 640

Gln Phe Leu Tyr Gln Thr Thr Thr Trp Leu Thr Glu Gln Gly Trp Thr
645 650 655

Val Ser Asp Val Phe Leu Met Leu Thr Thr Gln Tyr Gly Thr Leu Leu
660 665 670

Thr Pro Asp Ile Glu Asn Leu Leu Ala Ser Leu Arg Asn Gly Leu Ser
675 680 685

Gly Arg Glu Leu Phe Pro Glu Thr Leu Pro Gly Asp Gly Ala Pro Phe
690 695 700

Ile Ala Ala Ala Met Gln Leu Asp Ala Thr Asp Thr Ala Lys Ala Met
705 710 715 720

Leu Thr Trp Ala Asp Gln Leu Lys Pro Glu Gly Leu Thr Leu Thr Glu
725 730 735

Phe Ile Leu Leu Val Met Asn Ala Ala Pro Asn Asp Glu Gln Ala Gly
740 745 750

Gln Met Ala Gly Phe Cys Gln Ala Leu Trp Gln Leu Ala Leu Ile Ile
755 760 765

Arg Ser Thr Gly Leu Ser Thr Arg Glu Leu Thr Leu Leu Val Ser Gln
770 775 780

Pro Gly Arg Phe Arg Thr Gly Trp His His Leu Pro His Asp Leu Pro
785 790 795 800

Ala Leu Arg Asp Ile Thr Arg Phe His Ala Val Val Asn Arg Ser Gly
805 810 815
805 810 815 
Ser His Ala Gly Glu Val Leu Thr Ala Leu Glu Thr Gly Glu Leu Ser
820 825 830

Ser Ala Leu Leu Ala Arg Ala Leu Ser Gln Asn Glu Gln Asp Val Thr
835 840 845

Gly Ala Leu Ala Gln Val Arg Gly Ala Gly Glu Gln Asp Asn Ser Val
850 855 860

Phe Thr Ser Trp Glu Glu Val Asp Gln Ala Glu Gln Trp Leu Asp Met
865 870 875 880

Ser Glu Thr Leu Ser Ile Thr Pro Ser Gly Leu Ala Ser Leu Ile Ala
885 890 895

Leu Lys Tyr Ile Asn Val Ser Asp Asp Ser Ala Pro Leu Tyr Ser Gln
900 905 910

Trp Gln Val Val Ser Gly Leu Leu Gln Ala Gly Leu Lys Ser Ser Gln
915 920 925

Ser Ser Ala Leu His Asp Tyr Leu Glu Glu Gly Thr Ser Ser Ala Leu
930 935 940

Cys Ala Tyr Tyr Leu Arg Asn Leu Ala Pro Asn Met Val Ser Gly Arg
945 950 955 960

Asp Asp Leu Phe Gly Tyr Leu Leu Leu Asp Asn Gln Val Ser Ala Lys
965 970 975

Val Lys Thr Thr Arg Ile Ala Glu Ala Ile Ala Gly Ile Arg Leu Tyr
980 985 990

Ile Asn Arg Ala Leu Asn Gly Ile Glu Leu Ser Ala Met Ala Glu Val
995 1000 1005

Arg Gly Arg Gln Phe Phe Thr Asp Trp Asp Thr Phe Asn Lys Arg Tyr
1010 1015 1020

Ser Thr Trp Ala Gly Val Ser Glu Leu Val Tyr Tyr Pro Glu Asn Tyr
1025 1030 1035 1040

Leu Asp Pro Thr Val Arg Ile Gly Gln Thr Gly Met Met Asp Thr Leu
1045 1050 1055

Leu Gln Ser Val Ser Gln Ser Ser Ile Asn Arg Asp Thr Val Glu Asp
1060 1065 1070

Ala Phe Lys Thr Tyr Leu Thr Thr Phe Glu Gln Ile Ala Asn Leu Asn
1075 1080 1085

Thr Val Ser Gly Tyr His Asp Asn Ala Ser Met Thr Gln Gly Thr Thr
1090 1095 1100

Trp Tyr Val Gly Arg Ser Ile Thr Asp Gln Thr Asn Trp Tyr Trp Arg
1105 1110 1115 1120
Ser Ala Asn His Ser Lys Ile Gln Asp Ser Met Met Pro Ala Asn Ala
1125 1130 1135

Trp Thr Gly Trp Thr Lys Ile Asn Cys Gly Met Asn Pro Trp Ser Asp
1140 1145 1150

Leu Val Cys Ser Val Phe Phe Asn Ser Arg Leu Tyr Val Val Trp Val
1155 1160 1165

Glu Glu Asn Gln Ser Ala Asp Thr Glu Ala Glu Ser Thr Thr Thr Thr
1170 1175 1180

Gln Gln Ser Tyr Thr Leu Lys Leu Ser Phe Arg Arg Tyr Asp Gly Thr
1185 1190 1195 1200

Trp Ser Ser Pro Val Ser Phe Asp Ile Thr Gly Asn Ile Ala Phe Pro
1205 1210 1215

Glu Thr Gln Gly Met His Val Thr Cys Asn Pro Leu Thr Glu Gln Leu
1220 1225 1230

Tyr Cys Ala Phe Tyr Ser Val Thr Ser Lys Pro Asp Phe Asp Asn Ala
1235 1240 1245

Gln Leu Ile Ser Val Asp Asn Asp Met Thr Leu Asn Val Ile Ser Asp
1250 1255 1260

Ile Gly Ile Phe Lys Ser Val Ser His Glu Phe Asn Thr Ser Thr Glu
1265 1270 1275 1280

Lys Phe Ile Asn Asn Val Phe Ser Asp Pro Ser Ala Asn Tyr Phe Val
1285 1290 1295

Ser Ala Thr Ser Leu Ile Asp Asp Val Ile His Ser Asp Phe Ser Leu
1300 1305 1310

Leu Asn Ser Lys Thr Thr Ser Thr Val Phe Thr Asn Glu Asp Ser Ser
1315 1320 1325

Lev hev Thr Pre Ql'c Leu Hi z 12c Thr Va] £e: Cys Fhe VaJ 

1330 1335 1340 

Ser Thr Ala Gly Ile Ala Thr Gln Ser Thr Ile Glu Lys Phe Val Gln 
1345 1350 1355 1360 

Ala Gly Ile Glu Phe Glu Glu Ile Asn Phe Tyr Ala Gly Gln Ala Ala 
1365 1370 1375 

Gly Gly Phe Asp Gly Phe Val Gly Val Asp Val Ser Asn Ser Lys Val 
1380 1385 1390 

Tyr Gln Val Gly Lys Glu Ala Val Gly Val Thr Val Lys Ser Tyr Ser 
1395 1400 1405 

Val Thr Gly Val Ser Gly Ser Val Glu Leu Phe Ile Asp Ser Ser Asn 
1410 1415 1420 



Lys Tyr Phe Ser Gly Ile Leu Ser Asp Lys Met Ile Thr Ala Leu Ile 
1425 1430 1435 1440 

Ser Gly Ser Thr Ser Lys Val Asn Tyr Val Ser Ser Ile Gly Ser Gln 
1445 1450 1455 

Asp Phe Trp Ser Val Lys Ser Leu Met Pro Ala Leu Gln Ile Tyr Glu 
1460 1465 1470 

Leu Ile Asp Asp Ile Ile Leu Thr Ser Gly Val Asn Gly Thr Glu Ile 
1475 1480 1485 

Lys Ser Trp Pro Ser Ala Glu Trp Tyr Asn Asp Lys Leu Ser Leu Gln 
1490 1495 1500 

Ser Gly Asn Asn Leu Phe Asn Thr Lys Ser Leu Ser Phe Thr Val Asn 
1505 1510 1515 1520 

Thr Ser Asp Ile Val Glu Asp Glu Phe Asp Val Thr Phe Thr Phe Thr 
1525 1530 1535 

Ala Val Asp Gln Asn Asn Val Val Leu Ala Ala Arg Thr Ala Ile Leu 
1540 1545 1550 

Thr Val Ile Arg Asn Ile Asn Asn Asp Thr Ser Val Ile Ala Leu Arg 
1555 1560 1565 

Lys Asn Thr Arg Gly Ala Gln Tyr Ile Arg Phe Thr Ala Gly Asn Asp 
1570 1575 1580 

Val Ala Leu Ile Arg Leu Asn Thr Leu Phe Ala Arg Gln Leu Val Asp 
1585 1590 1595 1600 

Arg Ala Asn Thr Gly Ile Asp Thr Ile Leu Ser Met Glu Thr Gln Arg 
1605 1610 1615 

Leu Thr Glu Pro Ala Leu Glu Glu Gly Ser Asp Val Phe Met Asp Phe 
1620 1625 1630 

Ser Gly Ala Asn Ala Leu Tyr Phe Trp Glu Leu Phe Tyr Tyr Thr Pro 
1635 1640 1645 

Met Met Val Phe Gln Arg Leu Leu Gln Glu Gln His Phe Pro Glu Ala 
1650 1655 1660 

Thr Arg Trp Leu Gln Tyr Val Trp Asn Pro Ala Gly His Val Val Asn 
1665 1670 1675 1680 

Gly Val Leu Gln Asn Tyr Thr Trp Asn Val Arg Pro Leu Glu Glu Asp 
1685 1690 1695 

Thr Gly Trp Asn Asp Ser Pro Leu Asp Ser Ile Asp Pro Asp Ala Ile 
1700 1705 1710 

Ala Gln Tyr Asp Pro Met His Tyr Lys Val Ala Thr Phe Met Ser Tyr 
1715 1720 1725 

Leu Asp Leu Leu Ile Ala Arg Gly Asp Ala Ala Tyr Arg Leu Leu Glu 
1730 1735 1740 

Arg Asp Thr Leu Asn Glu Ala Arg Met Trp Tyr Val Gln Ala Leu Asn 
1745 1750 1755 1760 

Leu Leu Gly Asp Glu Pro Tyr Ile Ser Phe Asp Ala Asp Trp Ser Ala 
1765 1770 1775 

Leu Thr Leu Gly Asp Ala Ala Ser Glu Val Thr Arg Arg Asp Tyr Gln 
1780 1785 1790 

Glu Ala Leu Leu Ala Val Arg Arg Leu Val Pro Ala Pro Glu Thr Arg 
1795 1800 1805 

Thr Ala Asn Ser Leu Thr Ala Leu Phe Leu Pro Gln Gln Asn Glu Val 
1810 1815 1820 

Leu Lys Gly Tyr Trp Gln Thr Leu Ala Gln Arg Leu His Asn Leu Arg 
1825 1830 1835 1840 

His Asn Leu Ser Ile Asp Gly Gln Pro Leu Ser Leu Ser Val Tyr Ala 
1845 1850 1855 

Thr Pro Ser Glu Pro Ser Ala Leu Gln Ser Ala Val Val Asn Ser Ala 
1860 1865 1870 

Gln Gly Ala Ala Ala Leu Pro Ala Ala Val Met Pro Leu Tyr Ser Phe 
1875 1880 1885 

Pro Val Met Leu Glu Asn Ala Arg Gly Met Val Ser Leu Leu Thr Gly 
1890 1895 1900 

Phe Gly Asn Thr Leu Leu Gly Ile Thr Glu Arg Gln Asp Ala Glu Ala 
1905 1910 1915 1920 

Leu Ala Lys Leu Leu Gln Thr Gln Gly Ser Glu Leu Ile Arg Gln Gly 
1925 1930 1935 

Leu Arg Gln Gln Asp Asn Val Leu Glu Glu Ile Asp Ala Asp Ile Ala 
1940 1945 1950 

Ala Leu Glu Glu Ser Arg Arg Gly Ala Gln Met Arg Phe Glu Arg Tyr 
1955 1960 1965 

Lys Val Leu Tyr Glu Ala Asp Val Asn Thr Gly Glu Lys Gln Ala Met 
1970 1975 1980 

Asp Leu Tyr Leu Ser Ser Ser Val Leu Ser Ala Ser Thr Ala Ala Leu 
1985 1990 1995 2000 

Phe Leu Ala Glu Ala Ala Ala Asp Met Leu Pro Asn Ile Tyr Gly Leu 
2005 2010 2015 

Ala Val Gly Gly Ser Arg Tyr Gly Ala Leu Phe Lys Ala Thr Ala Ile 
2020 2025 2030 

Gly Ile Gln Val Ser Ser Asp Ala Thr Arg Ile Ser Ala Asp Lys Ile 
2035 2040 2045 

Ser Gln Ser Glu Val Tyr Arg Arg Arg Arg Glu Glu Trp Glu Ile Gln 
2050 2055 2060 

Arg Asp Ser Ala Gln Ser Asp Val Ala Gln Ile Asp Ala Gln Leu Ala 
2065 2070 2075 2080 

Ala Met Ala Val Arg Arg Glu Gly Ala Glu Leu Gln Lys Thr Tyr Leu 
2085 2090 2095 





Glu Thr Gln Gln Thr Gln Ala Gln Ala Gln Leu Ala Phe Leu Gln Ser 
2100 2105 2110 

Lys Phe Asn Asn Thr Ala Leu Tyr Ser Trp Leu Arg Gly Arg Leu Ser 
2115 2120 2125 

Ala Ile Tyr Tyr Gln Phe Tyr Asp Leu Ala Val Ser Arg Cys Leu Met 
2130 2135 2140 

Ala Gln Gln Ala Trp Gln Trp Asp Lys Phe Glu Thr Arg Ser Phe Ile 
2145 2150 2155 2160 

Gln Pro Gly Ala Trp Met Gly Ala Asn Ala Gly Leu Leu Ala Gly Glu 
2165 2170 2175 

Thr Leu Met Leu Asn Leu Ala Gln Met Glu Gln Ala Trp Leu Thr Gly 
2180 2185 2190 

Asp Glu Arg Ala Ile Glu Val Thr Arg Thr Val Cys Leu Ser Glu Val 
2195 2200 2205 

Tyr Thr Ser Leu Ala Glu Asp Ala Ala Phe Ser Leu Ala Asp Lys Val 
2210 2215 2220 

Val Glu Leu Val Ser Asn Gly Ser Gly Ser Ala Gly Thr Lys Ser Asn 
2225 2230 2235 2240 

Gly Leu Gln Met Asp Gln Gln Gln Leu Glu Ala Thr Leu Lys Leu Ala 
2245 2250 2255 

Asp Leu Gly Ile Gly Asn Asp Tyr Pro Val Ser Leu Gly Thr Met Arg 
2260 2265 2270 

Arg Ile Lys Gln Ile Ser Val Thr Leu Pro Ala Leu Val Gly Pro Tyr 
2275 2280 2285 

Gln Asp Val Arg Ala [illegible] Leu Ser Tyr Gly Gly Ser Met Val Met Pro 
2290 2295 2300 

Arg Gly Cys Ser Ala Leu Ala Val Ser His Gly Met Asn Asp Ser Gly 
2305 2310 2315 2320 

Gln Phe Gln Leu Asp Phe Asn Asp Pro Arg Tyr Leu Pro Phe Glu Gly 
2325 2330 2335 

Leu Pro Val Asp Asp Thr Gly Thr Leu Thr Leu Ser Phe Pro Asp Ala 
2340 2345 2350 

Asp Gly Lys Gln Gln Ala Met Leu Leu Ser Leu Ser Asp Ile Ile Leu 
2355 2360 2365 

His Ile Arg Tyr Thr Ile Ile Ser 
2370 2375 

(2) INFORMATION FOR SEQ ID NO : 5: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1429 amino acid residues 

(B) TYPE: amino acid 
(D) TOPOLOGY: Linear 

(ii) MOLECULE TYPE: PROTEIN (SepB) 

(ix) SEQUENCE DESCRIPTION: SEQ ID NO : 5: 



Met Gln Asn His Gln Asp Met Ala Ile Thr Ala Pro Thr Leu Pro Ser 
1 5 10 15 

Gly Gly Gly Ala Val Thr Gly Leu Lys Gly Asp Ile Ala Ala Ala Gly 
20 25 30 

Pro Asp Gly Ala Ala Thr Leu Ser Ile Pro Leu Pro Val Ser Pro Gly 
35 40 45 

Arg Gly Tyr Ala Pro Thr Gly Ala Leu Asn Tyr His Ser Arg Ser Gly 
50 55 60 

Asn Gly Pro Phe Gly Ile Gly Trp Gly Ile Gly Gly Ala Ala Val Gln 
65 70 75 80 

Arg Arg Thr Arg Asn Gly Ala Pro Thr Tyr Asp Asp Thr Asp Glu Phe 
85 90 95 

Thr Gly Pro Asp Gly Glu Val Leu Val Pro Ala Leu Thr Ala Ala Gly 
100 105 110 

Thr Gln Glu Ala Arg Gln Ala Thr Ser Leu Leu Gly Ile Asn Pro Gly 
115 120 125 

Gly Ser Phe Asn Val Gln Val Tyr Arg Ser Arg Thr Glu Gly Ser Leu 
130 135 140 

Ser Arg Leu Glu Arg Trp Leu Pro Ala Asp Glu Thr Glu Thr Glu Phe 
145 150 155 160 

Trp Val Leu Tyr Thr Pro Asp Gly Gln Val Ala Leu Leu Gly Arg Asn 
165 170 175 

Ala Gln Ala Arg Ile Ser Asn Pro Thr Ala Pro Thr Gln Thr Ala Val 
180 185 190 

Trp Leu Met Glu Ser Ser Val Ser Leu Thr Gly Glu Gln Met Tyr Tyr 
195 200 205 

Gln Tyr Arg Ala Glu Asp Asp Asp Gly Cys Asp Glu Ala Glu Arg Asp 
210 215 220 

Ala His Pro Gln Ala Gly Ala Gln Arg Tyr Pro Val Ala Val Trp Tyr 
225 230 235 240 

Gly Asn Arg Gln Ala Ala Arg Thr Leu Pro Ala Leu Val Ser Thr Pro 
245 250 255 

Ser Met Asp Ser Trr Leu Phe Ile Leu Val Phe Asp Tyr Gly Glu Arg 
260 265 270 

Ser Ser Val Leu Ser Glu Ala Pro Ala Trp Gln Thr Pro Gly Ser Gly 
275 280 285 

Glu Trp Leu Cys Arg Gln Asp Cys Phe Ser Gly Tyr Glu Phe Gly Phe 
290 295 300 

Asn Leu Arg Thr Arg Arg Leu Cys Arg Gln Val Leu Met Phe His Tyr 
305 310 315 320 

Leu Gly Val Leu Ala Gly Ser Ser Gly Ala Asn Asp Ala Pro Ala Leu 
325 330 335 

Ile Ser Arg Leu Leu Leu Asp Tyr Arg Glu Ser Pro Ser Leu Ser Leu 
340 345 350 

Leu Glu Asn Val His Gln Val Ala Tyr Glu Ser Asp Gly Thr Ser Cys 
355 360 365 

Ala Leu Pro Ala Leu Ala Leu Gly Trp Gln Thr Phe Thr Pro Pro Thr 
370 375 380 

Leu Ser Ala Trp Gln Thr Arg Asp Asp Met Gly Lys Leu Ser Leu Leu 
385 390 395 400 

Gln Pro Tyr Gln Leu Val Asp Leu Asn Gly Glu Gly Val Val Gly Ile 
405 410 415 

Leu Tyr Gln Asp Ser Gly Ala Trp Trp Tyr Arg Glu Pro Val Arg Gln 
420 425 430 

Ser Gly Asp Asp Pro Asp Ala Val Thr Trp Gly Ala Ala Ala Ala Leu 
435 440 445 

Pro Thr Met Pro Ala Leu His Asn Ser Gly Ile Leu Ala Asp Leu Asn 
450 455 460 

Gly Asp Gly Arg Leu Glu Trp Val Val Thr Ala Pro Gly Val Ala Gly 
465 470 475 480 

Met Tyr Asp Arg Thr Pro Gly Arg Asp Trp Leu His Phe Thr Pro Leu 
485 490 495 

Ser Ala Leu Pro Val Glu Tyr Ala His Pro Lys Ala Val Leu Ala Asp 
500 505 510 

Ile Leu Gly Ala Gly Leu Thr Asp Met Val Leu Ile Gly Pro Arg Ser 
515 520 525 

Val Arg Leu Tyr Ser Gly Lys Asn Asp Gly Trp Asn Lys Gly Glu Thr 
530 535 540 

Val Gln Gln Thr Glu Arg Leu Thr Leu Pro Val Pro Gly Val Asp Pro 
545 550 555 560 

Arg Thr Leu Val Ala Phe Ser Asp Met Ala Gly Ser Gly Gln Gln His 
565 570 575 

Leu Thr Glu Val Arg Ala Asn Gly Val Arg T'Vj Trp Pro Asr. Leu Gly 

580 585 590 

His Gly Arg Phe Gly Gln Pro Val Asn Ile Pro Gly Phe Ser Gln Ser 
595 600 605 

Val Thr Thr Phe Asn Pro Asp Glu Ile Leu Leu Ala Asp Thr Asp Gly 
610 615 620 

Ser Gly Thr Thr Asp Leu Ile Tyr Ala Met Ser Asp Arg Leu Val Ile 
625 630 635 640 

Tyr Phe Asn Gln Ser Gly Asn Tyr Phe Ala Glu Pro His Thr Leu Leu 
645 650 655 

Leu Pro Lys Gly Val Arg Tyr Asp Arg Thr Cys Ser Leu Gln Val Ala 
660 665 670 

Asp Ile Gln Gly Leu Gly Val Pro Ser Leu Leu Leu Thr Val Pro His 
675 680 685 

Val Ala Pro His His Trp Val Cys His Leu Ser Ala Asp Lys Pro Trp 
690 695 700 

Leu Leu Asn Gly Met Asn Asn Asn Met Gly Ala Arg His Ala Leu His 
705 710 715 720 

Tyr Arg Ser Ser Val Gln Phe Trp Leu Asp Glu Lys Ala Glu Ala Leu 
725 730 735 

Ala Ala Gly Ser Ser Pro Ala Cys Tyr Leu Pro Phe Thr Leu His Thr 
740 745 750 

Leu Trp Arg Ser Val Val Gln Asp Glu Ile Thr Gly Asn Arg Leu Val 
755 760 765 

Ser Asp Val Leu Tyr Arg His Gly Val Trp Asp Gly Gln Glu Arg Glu 
770 775 780 

Phe Arg Gly Phe Gly Phe Val Glu Ile Arg Asp Thr Asp Thr Leu Ala 
785 790 795 800 

Ser Gln Gly Thr Ala Thr Glu Leu Ser Met Pro Ser Val Ser Arg Asn 
805 810 815 

Trp Tyr Ala Thr Gly Val Pro Ala Val Asp Glu Arg Leu Pro Glu Thr 
820 825 830 

Tyr Trp Gln Asn Asp Ala Ala Ala Phe Ala Asp Phe Ala Thr Arg Phe 
835 840 845 

Thr Val Gly Ser Gly Glu Asp Glu Gln Thr Tyr Thr Pro Asp Asp Ser 
850 855 860 

Lys Thr Phe Trp Leu Gln Arg Ala Leu Lys Gly Ile Leu Leu Arg Ser 
865 870 875 880 

Glu Leu Tyr Gly Ala Asp Gly Ser Ser C-Jr. Ala Asp Ile Pro Tyr Ser 
885 890 895 

Val Thr Glu Ser Arg Pro Gln Val Arg Leu Val Glu Ala Asn Gly Asp 
900 905 910 

Tyr Pro Val Val Trp Pro Met Gly Ala Glu Ser Arg Thr Ser Val Tyr 
915 920 925 

Glu Arg Tyr His Asn Asp Pro Gln Cys Gln Gln Gln Ala Val Leu Leu 
930 935 940 

Ser Asp Glu Tyr Gly Phe Pro Leu Arg Gln Val Ser Val Asn Tyr Pro 
945 950 955 960 

Arg Arg Pro Pro Ser Ala Asp Asn Pro Tyr Pro Ala Ser Leu Pro Ala 
965 970 975 

Thr Leu Phe Ala Asn Ser Tyr Asp Glu Gln Gln Gln Ile Leu Arg Leu 
980 985 990 






Gly Leu Gln Gln Ser Ser Ala His His Leu Val Ser Leu Ser Glu Gly 
995 1000 1005 

His Trp Leu Leu Gly Leu Ala Glu Ala Ser Arg Asp Asp Val Phe Thr 
1010 1015 1020 

Tyr Ser Ala Asp Asn Val Pro Glu Gly Gly Leu Thr Leu Glu His Leu 
1025 1030 1035 1040 

Leu Ala Pro Glu Ser Leu Val Ser Asp Ser Gln Val Gly Thr Leu Ala 
1045 1050 1055 

Gly Gln Gln Gln Val Trp Tyr Leu Asp Ser Gln Asp Val Ala Thr Val 
1060 1065 1070 

Ala Ala Pro Pro Leu Pro Pro Lys Val Ala Phe Ile Glu Thr Ala Val 
1075 1080 1085 

Leu Asp Glu Gly Met Val Ser Ser Leu Ala Ala Tyr Ile Val Asp Glu 
1090 1095 1100 

His Leu Glu Gln Ala Gly Tyr Arg Gln Ser Gly Tyr Leu Phe Pro Arg 
1105 1110 1115 1120 

Gly Arg Glu Ala Glu Gln Ala Leu Trp Thr Gln Cys Gln Gly Tyr Val 
1125 1130 1135 

Thr Tyr Ala Gly Ala Glu His Phe Trp Leu Pro Leu Ser Phe Arg Asp 
1140 1145 1150 

Ser Met Leu Thr Gly Pro Val Thr Val Thr Arg Asp Ala Tyr Asp Cys 
1155 1160 1165 

Val Ile Thr Gln Trp Gln Asp Ala Ala Gly Ile Val Thr Thr Ala Asp 
1170 1175 1180 

Tyr Asp Trp Arg Phe Leu Thr Pro Val Arg Val Thr Asp Pro Asn Asp 
1185 1190 1195 1200 

Asn Leu Gln Ser Val Thr Leu Asp Ala Leu Gly Arg Val Thr Thr Leu 
1205 1210 1215 

Arg Phe Trp Gly Thr Glu Asn Gly Ile Ala Thr Gly Tyr Ser Asp Ala 
1220 1225 1230 

Thr Leu Ser Val Pro Asp Gly Ala Ala Ala Ala Leu Ala Leu Thr Ala 
1235 1240 1245 

Pro Leu Pro Val Ala Gln Cys Leu Val Tyr Val Thr Asp Ser Trp Gly 
1250 1255 1260 

Asp Asp Asp Asn Glu Lys Met Pro Pro His Val Val Val Leu Ala Thr 
1265 1270 1275 1280 



Asp Arg Tyr Asp Ser Asp Thr Gly Gln Gln Val Arg Gln Gln Val Thr 
1285 1290 1295 

Phe Ser Asp Gly Phe Gly Arg Glu Leu Gln Ser Ala Thr Arg Gln Ala 
1300 1305 1310 

Glu Gly Asn Ala Trp Gln Arg Gly Arg Asp Gly Lys Leu Val Thr Ala 
1315 1320 1325 

Ser Asp Gly Leu Pro Val Thr Val Ala Thr Asn Phe Arg Trp Ala Val 
1330 1335 1340 

Thr Gly Arg Ala Glu Tyr Asp Asn Lys Gly Leu Pro Val Arg Val Tyr 
1345 1350 1355 1360 

Gln Pro Tyr Phe Leu Asp Ser Trp Gln Tyr Val Ser Asp Asp Ser Ala 
1365 1370 1375 

Arg Gln Asp Leu Tyr Ala Asp Thr His Phe Tyr Asp Pro Thr Ala Arg 
1380 1385 1390 

Glu Trp Gln Val Ile Thr Ala Lys Gly Glu Arg Arg Gln Val Leu Tyr 
1395 1400 1405 

Thr Pro Trp Phe Val Val Ser Glu Asp Glu Asn Asp Thr Val Gly Leu 
1410 1415 1420 

Asn Asp Ala Ser 
1425 

(2) INFORMATION FOR SEQ ID NO : 6: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 973 amino acid residues 

(B) TYPE: amino acid 
(D) TOPOLOGY : Linear 

(ii) MOLECULE TYPE: PROTEIN (SepC) 

(ix) SEQUENCE DESCRIPTION: SEQ ID NO : 6: 



Met Ser Thr Ser Leu Phe Ser Ser Thr Pro Ser Val Ala Val Leu Asp 
1 5 10 15 

Asn Arg Gly Leu Leu Val Arg Glu Leu Gln Tyr Tyr Arg His Pro Asp 
20 25 30 

Thr Pro Glu Glu Thr Asp Glu Arg Ile Thr Cys His Gln His Asp Glu 
35 40 45 

Arg Gly Ser Leu Ser Gln Ser Ala Asp Pro Arg Leu His Ala Ala Gly 
50 55 60 

Leu Thr Asn Phe Thr Tyr Leu Asn Ser Leu Thr Gly Thr Val Leu Gln 
65 70 75 80 

Ser Val Ser Ala Asp Ala Gly Thr Ser Leu Glu Leu Ser Asp Ala Ala 
85 90 95 

Gly Arg Ala Phe Leu Ala Val Thr Gly Ala Gly Thr Glu Asp Ala Val 
100 105 110 

Thr Arg Thr Trp Gln Tyr Glu Asp Asp Thr Leu Pro Gly Arg Pro Leu 
115 120 125 

Ser Ile Thr Glu Gln Val Thr Gly Glu Ala Ala Gln Ile Thr Glu Arg 
130 135 140 

Phe Val Tyr Ala Gly Asn Thr Asp Ala Glu Lys Ile Leu Asn Leu Ala 
145 150 155 160 

Gly Gln Cys Val Ser His Tyr Asp Thr Ala Gly Leu Val Gln Thr Asp 
165 170 175 



Ser Ile Ala Leu Ser Gly Val Pro Leu Ala Val Thr Arg Gln Leu Leu 
180 185 190 

Pro Asp Ala Ala Gly Ala Asn Trp Met Gly Glu Asp Ala Ser Ala Trp 
195 200 205 

Asn Asp Leu Leu Asp Gly Glu Thr Phe Phe Thr Gln Thr His Ala Asp 
210 215 220 

Ala Thr Gly Ala Val Leu Ser Ile Thr Asp Ala Lys Gly Asn Leu Gln 
225 230 235 240 

Arg Val Ala Tyr Asp Val Ala Gly Leu Leu Ser Gly Ser Trp Leu Thr 
245 250 255 

Leu Lys Asp Gly Thr Glu Gln Val Ile Val Ala Ser Leu Thr Tyr Ser 
260 265 270 

Ala Ala Gly Lys Lys Leu Arg Glu Glu His Gly Asn Gly Val Val Thr 
275 280 285 

Ser Tyr Ile Tyr Glu Pro Glu Thr Gln Arg Leu Thr Gly Ile Lys Thr 
290 295 300 

Glu Arg Pro Ser Gly His Val Ala Gly Ala Lys Val Leu Gln Asp Leu 
305 310 315 320 

Arg Tyr Thr Tyr Asp Pro Val Gly Asn Val Leu Ser Val Asn Asn Asp 
325 330 335 

Ala Glu Glu Thr Arg Phe Trp Arg Asn Gln Lys Val Val Pro Glu Asn 
340 345 350 

Thr Tyr Ile Tyr Asp Ser Leu Tyr Gln Leu Val Ser Ala Thr Gly Arg 
355 360 365 

Glu Met Ala Asn Ala Gly Gln Gln Gly Asn Asp Leu Pro Ser Ala Thr 
370 375 380 

Ala Pro Leu Pro Thr Asp Ser Ser Ala Tyr Thr Asn Tyr Thr Arg Thr 
385 390 395 400 

Tyr Arg Tyr Asp Arg Gly Gly Asn Leu Thr Gln Met Arg His Ser Ala 
405 410 415 

Pro Ala Thr Asn Asn Asn Tyr Thr Thr Asp Ile Thr Val Ser Asp Arg 
420 425 430 

Ser Asn Arg Ala Val Leu Ser Thr Leu Ala Glu Val Pro Ser Asp Val 
435 440 445 

Asp Met Leu Phe Ser Ala Gly Gly His Gln Lys His Leu Gln Pro Gly 
450 455 460 

Gln Ala Leu Val Trp Thr Pro Arg Gly Glu Leu Gln Lys Val Thr Pro 
465 470 475 480 

Val Val Arg Asp Gly Gly Ala Asp Asp Ser Glu Ser Tyr Arg Tyr Asp 
485 490 495 

Ala Gly Ser Gln Arg Ile Ile Lys Thr Gly Thr Arg Gln Thr Gly Asn 
500 505 510 

Asn Val Gln Thr Gln Arg Val Val Tyr Leu Pro Gly Leu Glu Leu Arg 
515 520 525 

Ile Met Ala Asn Gly Val Thr Glu Lys Glu Ser Leu Gln Val Ile Thr 
530 535 540 

Val Gly Glu Ala Gly Arg Ala Gln Val Arg Val Leu His Trp Glu Ile 
545 550 555 560 

Gly Lys Pro Asp Asp Leu Asp Glu Asp Ser Val Arg Tyr Ser Tyr Asp 
565 570 575 

Asn Leu Val Gly Ser Ser Gln Leu Glu Leu Asp Arg Glu Gly Tyr Leu 
580 585 590 

Ile Ser Glu Glu Glu Phe Tyr Pro Tyr Gly Gly Thr Ala Val Leu Thr 
595 600 605 

Ala Arg Ser Glu Val Glu Ala Asp Tyr Lys Thr Ile Arg Tyr Ser Gly 
610 615 620 

Lys Glu Arg Asp Ala Thr Gly Leu Asp Tyr Tyr Gly Tyr Arg Tyr Tyr 
625 630 635 640 

Gln Pro Trp Ala Gly Arg Trp Leu Ser Thr Asp Pro Ala Gly Thr Val 
645 650 655 

Asp Gly Leu Asn Leu Phe Arg Met Val Arg Asn Asn Pro Val Thr Leu 
660 665 670 

Phe Asp Ser Asn Gly Arg Ile Ser Thr Gly Gln Glu Ala Arg Arg Leu 
675 680 685 

Val Gly Glu Ala Phe Val His Pro Leu His Met Pro Val Phe Glu Arg 
690 695 700 

Ile Ser Val Glu Arg Lys Ile Ser Met Ser Val Arg Glu Ala Gly Ile 
705 710 715 720 

Tyr Thr Ile Ser Ala Leu Gly Glu Gly Ala Ala Ala Lys Gly His Asn 
725 730 735 

Ile Leu Glu Lys Thr Ile Lys Pro Gly Ser Leu Lys Ala Ile Tyr Gly 
740 745 750 

Asp Lys Ala Glu Ser Ile Leu Gly Leu Ala Lys Arg Ser Gly Leu Val 
755 760 765 

Gly Arg Val Gly Gln Trp Asp Ala Ser Gly Val Arg Gly Ile Tyr Ala 
770 775 780 

His Asn Arg Pro Gly Gly Glu Asp Leu Val Tyr Pro Val Ser Leu Gln 
785 790 795 800 

Asn Thr Ser Ala Asn Glu Ile Val Asn Ala Trp Ile Lys Phe Lys Ile 
805 810 815 

Ile Thr Pro Tyr Thr Gly Asp Tyr Asp Met His Asp Ile Ile Lys Phe 
820 825 830 

Ser Asp Gly Lys Gly His Val Pro Thr Ala Glu Ser Ser Glu Glu Arg 
835 840 845 

Gly Val Lys Asp Leu Ile Asn Lys Gly Val Ala Glu Val Asp Pro Ser 
850 855 860 



Arg Pro Phe Glu Tyr Thr Ala Met Asn Val Ile Arg His Gly Pro Gln 
865 870 875 880 

Val Asn Phe Val Pro Tyr Met Trp Glu His Glu His Asp Lys Val Val 
885 890 895 

Asn Asp Asn Gly Tyr Leu Gly Val Val Ala Ser Pro Gly Pro Phe Pro 
900 905 910 

Val Ala Met Val His Gln Gly Glu Trp Thr Val Phe Asp Asn Ser Glu 
915 920 925 

Glu Leu Phe Asn Phe Tyr Lys Ser Thr Asn Thr Pro Leu Pro Glu His 
930 935 940 

Trp Ser Gln Asp Phe Met Asp Arg Gly Lys Gly Ile Val Ala Thr Pro 
945 950 955 960 

Arg His Ala Glu Leu Leu Asp Lys Arg Arg Val Met Tyr 
965 970 



By the authorised agents 
A. J. PARK & SON 

[Figure: map with sites labelled H and B; drawing not reproduced. Labels in order of appearance, with the number printed beside each: H (180.6), H (180.1), H (1), H (190.6), B (178.7), B (160.7), B (13.9), B (17.5), B (18), H (20), H (21), H (24), H (1319), B (127.6), H (130.1), H (126.6), B (27.3), B (28.4), H (29.5), B (42.8), H (157.6), B (155.7), B (154.5), B (154.1), B (153.6), B (151.6), B (147.6), H (146.3), B (138.6), H (137.8), B (137.6), H (97.6), B (96.4), H (90.9), B (91.2), H (92), H (93), H (92.3)] 